Data retrieval
The core of retrieving data from Suvvy tables is a special dynamic SQL generator (Dynamic SQL Generator - DSG). It is a mechanism that allows converting client queries into SQL queries to the table, thus enabling the processing of tables of unlimited length without overloading the model context and obtaining the most accurate answers.
What does it allow?
Using tables is usually necessary to obtain:
current prices for goods or services
information about product stocks
list of current offers
dynamic information (sales, promo codes, etc.)
and so on.
Requirements for the table
The main format requirements for the table to connect it correctly are as follows:
The first row of the table must be filled with the column headers of the table.
The table should not contain merged rows (such tables are usually referred to as simple or normalized).
Why is it important that the table is simple (without merging) - so that queries to its data work correctly.
Here's an example of a correct and incorrect table:
Example of an incorrect table
Example of a correct table
Connecting a table
To connect a table, go to the Tables section in the bot settings and choose one of the connection options:
uploading the table from a CSV file
uploading from an Excel file
connecting a Google Sheet
When connecting a table from a CSV file, it is important that the file is saved in UTF-8 encoding with delimiters - commas.
Fundamentally, the connection process is almost identical, except for going through Google authorization when connecting a Google Sheet.
When connecting a static table (CSV / XLS), just select the file.
And if it's about a Google Sheet - you need to authorize and select the required sheet:
After selection, a table creation form will open:
Table Name - name for display in the list.
Sheet - ability to select a Google Sheet sheet.
Table Data Refresh Frequency (minutes) - the time the table is updated for Suvvy's responses. Default is 1440 minutes or once every 24 hours. The refresh button allows you to update the data at any time.
We do not recommend updating more frequently than every 30 minutes because every time the table is re-downloaded, if it is too large, it may overload the Google account and it may limit access to it.
Use - parameter determining whether Suvvy will see the table or not. Default is off.
Function Name - the name of the function by which Suvvy will refer to the table. More about functions in a section about functions.
Table Description - basic information describing the meaning of the table and the data stored in it. Based on the description, Suvvy will understand when to call the table for which queries.
The description is a very important component as it determines the behavior and query, which will be formed to the table.
Preliminary SQL-query - this is a pre-prepared query to the table, which is executed before the neural network query.
Column Parameters
When filling in the table columns, it's important to follow a few rules - the name should be without spaces and should reflect the essence of the values in the column.
The table displays the first 5 rows of the table - just for understanding what data is stored in it.
Search by part of the string - allows finding data in the table even if part of the text from the column was given.
Ignore case - allows not to consider the case of the word (capital or lowercase) and find names written in uppercase letters even if the selection was made with a lowercase word.
Always string - allows working with numbers as if they were strings.
Working with the Table
The first thing to know is that Suvvy understands when to take data from the table based on the table description.
Additionally, there is always an option to refer to the table as a function from the main instruction:
For example, to refer to the table we can use the structure:
"If the client.......call the function
", whereis the specific function name from the table created during table creation.
As seen from the prompt, we first asked the client for the necessary parameters and then passed them into the function for which we need to filter.
When calling the table, if we are debugging (more about debugging here), we can see that Suvvy creates a special SQL query to the table to obtain specific data:
Suvvy creates SQL queries using a special engine (DSG), but it's generally good to understand the principles of SQL query language, if the language is unfamiliar to you, we recommend reading the block below:
Queries for Table SelectionImportant
Since tables can contain quite a large amount of data and in one go Suvvy can receive a multitude of rows from the table, it's very important:
To pre-apply the maximum possible filter to the query to the table.
To apply an artificial restriction on the number of selected data
If we're talking about point 1, you can filter queries by specifying what we need to pass to the table, in the example above, it was filtering by name and color. The color in this case allows narrowing the search.
Example of Numerical Selection
Additionally, we can apply filters to fields with type Date and number.
For example, if we want to specify that Suvvy provides materials only where the quantity is greater than 60, we can do this through the prompt/instruction as follows:
In this case, additional conditions will be added to the SQL query to the table, and we will get a selection considering the additional filter:
Example of Selection Using LIMIT
Sometimes, despite filtering by columns, the selection still turns out to be large, thus overloading context and increasing query cost.
In this case, additionally, we can apply an artificial restriction on the number of records retrieved from the table using the LIMIT operator.
For instance, formulating the prompt in this way:
In this case, the phrase "no more than N records" allows placing a limitation on the table and getting the first N suitable records.
Here's what the table selection returned, with only one record:
In case such a prompt works unstably, you can additionally specify in the text in brackets a hint for the model:
Preliminary SQL QueryThere are cases when in the column values we want to work with specific values. They are known to us beforehand.
For example, the column has values of cities like:
Moscow
Saint Petersburg
Volgograd
Zelenogorsk
Tyumen
But we know that according to the task conditions we need to find data only for the values
Saint Petersburg
Volgograd
Zelenogorsk
Based on what we discussed earlier, we can describe these selections in the prompt, for example:
And this will work, BUT, our task is always to increase the quality of answers and reduce the chance of errors. As we previously described in the section Best Practices the basis for increasing quality is reducing the instruction and where possible using algorithmic mechanisms.
Therefore, here the preliminary filter mechanism for the table comes to our aid, which is specified through the Preliminary SQL Query field. By writing a preliminary query to the table, we immediately get a ready selection only for these cities without the additional need to apply it within the instruction and thereby increasing the context and load on the model:
Working with Cross-tablesThere is a type of tables known as cross-tables or data matrices.
In this table, data search occurs at the intersection of the first column values (usually representing categories) and the remaining columns (representing other data dimensions).
Cross-tables are very convenient for representing information when you need to visualize the relationship between two or more variables, which is often used in statistics and analytics.
For a specific example, imagine you have a table with sales data of products, where the first column is the product names, and the other columns are sales in different regions. Then each value within the table is the sales amount of a specific product in a specific region:
You might wonder, how to make selections in this case, and can Suvvy retrieve data from such tables on its own?
Yes! It can!
Firstly, even if we don’t describe anything forcibly - it will cope with the task using a simple prompt:
The result we will receive is:
As seen, it understood in which columns what is located and applied the correct selections.
However, when it comes to large instructions, and if we're talking about the quality and stability of responses, additional techniques can be used.
We can add the following instruction template within the table description:
A clearly specified template will provide better accuracy in large core instructions.
Last updated