Data retrieval

The core of retrieving data from Suvvy tables is a special dynamic SQL generator (Dynamic SQL Generator - DSG). It is a mechanism that allows converting client queries into SQL queries to the table, thus enabling the processing of tables of unlimited length without overloading the model context and obtaining the most accurate answers.

What does it allow?

Using tables is usually necessary to obtain:

  • current prices for goods or services

  • information about product stocks

  • list of current offers

  • dynamic information (sales, promo codes, etc.)

and so on.

Requirements for the table

The main format requirements for the table to connect it correctly are as follows:

  1. The first row of the table must be filled with the column headers of the table.

  2. The table should not contain merged rows (such tables are usually referred to as simple or normalized).

Why is it important that the table is simple (without merging) - so that queries to its data work correctly.

Here's an example of a correct and incorrect table:

Example of an incorrect table

Example of a correct table

Connecting a table

To connect a table, go to the Tables section in the bot settings and choose one of the connection options:

  • uploading the table from a CSV file

  • uploading from an Excel file

  • connecting a Google Sheet

When connecting a table from a CSV file, it is important that the file is saved in UTF-8 encoding with delimiters - commas.

Fundamentally, the connection process is almost identical, except for going through Google authorization when connecting a Google Sheet.

When connecting a static table (CSV / XLS), just select the file.

And if it's about a Google Sheet - you need to authorize and select the required sheet:

After selection, a table creation form will open:

Table Name - name for display in the list.

Sheet - ability to select a Google Sheet sheet.

Table Data Refresh Frequency (minutes) - the time the table is updated for Suvvy's responses. Default is 1440 minutes or once every 24 hours. The refresh button allows you to update the data at any time.

We do not recommend updating more frequently than every 30 minutes because every time the table is re-downloaded, if it is too large, it may overload the Google account and it may limit access to it.

Use - parameter determining whether Suvvy will see the table or not. Default is off.

Function Name - the name of the function by which Suvvy will refer to the table. More about functions in a section about functions.

Table Description - basic information describing the meaning of the table and the data stored in it. Based on the description, Suvvy will understand when to call the table for which queries.

The description is a very important component as it determines the behavior and query, which will be formed to the table.

Preliminary SQL-query - this is a pre-prepared query to the table, which is executed before the neural network query.

Column Parameters

When filling in the table columns, it's important to follow a few rules - the name should be without spaces and should reflect the essence of the values in the column.

The table displays the first 5 rows of the table - just for understanding what data is stored in it.

Search by part of the string - allows finding data in the table even if part of the text from the column was given.

Ignore case - allows not to consider the case of the word (capital or lowercase) and find names written in uppercase letters even if the selection was made with a lowercase word.

Always string - allows working with numbers as if they were strings.

Working with the Table

The first thing to know is that Suvvy understands when to take data from the table based on the table description.

Additionally, there is always an option to refer to the table as a function from the main instruction:

For example, to refer to the table we can use the structure:

"If the client.......call the function

", whereis the specific function name from the table created during table creation.

#INFORMATION ON BUILDING MATERIALS
If the client asks about the cost of building materials do the following actions:
1. Be sure to ask the client the name of the material and the color.
2. Then call the function list_of_materials, and pass the material and color there.

As seen from the prompt, we first asked the client for the necessary parameters and then passed them into the function for which we need to filter.

When calling the table, if we are debugging (more about debugging here), we can see that Suvvy creates a special SQL query to the table to obtain specific data:

Suvvy creates SQL queries using a special engine (DSG), but it's generally good to understand the principles of SQL query language, if the language is unfamiliar to you, we recommend reading the block below:

SQL (Structured Query Language)

SQL (Structured Query Language) is a language used for managing databases. Imagine a database as a large Excel table where various data is stored - for example, store customer information, products, orders, etc. With SQL we can add data to tables, retrieve it, modify or delete it - all quickly and efficiently.

Here are the main principles and operators that will help you understand what SQL is:

  1. SELECT - is an operator that allows you to "fetch" needed data from a table. For example, if we have a table with customers, and we want to find out all their names, we write the query:

    SELECT name FROM customers;

    This is similar to selecting only one column in Excel.

  2. FROM - specifies where we get the data from. In the previous example, we selected the "name" column from the "customers" table.

  3. WHERE - operator for filtering data. For example, if we are interested in only those customers who live in Moscow:

    SELECT name FROM customers WHERE city = 'Moscow';

    This helps select only those rows that meet our conditions.

  4. INSERT INTO - allows adding new information to a table. For example, let's add a new customer:

    INSERT INTO customers (name, city) VALUES ('Ivan', 'Moscow');

    This is like adding a new row to the table.

  5. UPDATE - is used for modifying data. For example, if Ivan moved from Moscow to Saint Petersburg, we can update his data:

    UPDATE customers SET city = 'Saint Petersburg' WHERE name = 'Ivan';
  6. DELETE - deletes data from the table. If Ivan is no longer our client, we can delete him:

    DELETE FROM customers WHERE name = 'Ivan';
  7. Main Principles of SQL are that it works with tables, consisting of rows and columns. Rows are separate records (for example, data of a specific client), and columns are the properties of records (for example, name, age, city).

SQL can also join tables. For example, we have a table with orders and a table with customers, and we can join them to see which customers made which orders.

All this allows efficient data management, extracting necessary information, and performing any operations with a database. The important point is that SQL is quite simple to master, and even with minimal knowledge, you can start writing useful queries.

Queries for Table Selection

Important

Since tables can contain quite a large amount of data and in one go Suvvy can receive a multitude of rows from the table, it's very important:

  1. To pre-apply the maximum possible filter to the query to the table.

  2. To apply an artificial restriction on the number of selected data

If we're talking about point 1, you can filter queries by specifying what we need to pass to the table, in the example above, it was filtering by name and color. The color in this case allows narrowing the search.

Example of Numerical Selection

Additionally, we can apply filters to fields with type Date and number.

For example, if we want to specify that Suvvy provides materials only where the quantity is greater than 60, we can do this through the prompt/instruction as follows:

#INFORMATION ON BUILDING MATERIALS
If the client asks about the cost of building materials do the following actions:
1. Be sure to ask the client the name of the material and the color.
2. Then call the function list_of_materials, and pass the material and color and 
only those where quantity > 60.

In this case, additional conditions will be added to the SQL query to the table, and we will get a selection considering the additional filter:

Example of Selection Using LIMIT

Sometimes, despite filtering by columns, the selection still turns out to be large, thus overloading context and increasing query cost.

In this case, additionally, we can apply an artificial restriction on the number of records retrieved from the table using the LIMIT operator.

For instance, formulating the prompt in this way:

#INFORMATION ON BUILDING MATERIALS
If the client asks about the cost of building materials do the following actions:
1. Be sure to ask the client the name of the material and the color.
2. Then call the function list_of_materials, and pass the material and color and 
only those where quantity > 60, but no more than 1 record.

In this case, the phrase "no more than N records" allows placing a limitation on the table and getting the first N suitable records.

Here's what the table selection returned, with only one record:

In case such a prompt works unstably, you can additionally specify in the text in brackets a hint for the model:

#INFORMATION ON BUILDING MATERIALS
If the client asks about the cost of building materials do the following actions:
1. Be sure to ask the client the name of the material and the color.
2. Then call the function list_of_materials, and pass the material and color and 
only those where quantity > 60, but no more than 3 records (LIMIT operator).

Preliminary SQL Query

There are cases when in the column values we want to work with specific values. They are known to us beforehand.

For example, the column has values of cities like:

  • Moscow

  • Saint Petersburg

  • Volgograd

  • Zelenogorsk

  • Tyumen

But we know that according to the task conditions we need to find data only for the values

  • Saint Petersburg

  • Volgograd

  • Zelenogorsk

Based on what we discussed earlier, we can describe these selections in the prompt, for example:

When referring to the table function <table function> find values only by cities "
Saint Petersburg, Volgograd, Zelenogorsk.

And this will work, BUT, our task is always to increase the quality of answers and reduce the chance of errors. As we previously described in the section Best Practices the basis for increasing quality is reducing the instruction and where possible using algorithmic mechanisms.

Therefore, here the preliminary filter mechanism for the table comes to our aid, which is specified through the Preliminary SQL Query field. By writing a preliminary query to the table, we immediately get a ready selection only for these cities without the additional need to apply it within the instruction and thereby increasing the context and load on the model:

Working with Cross-tables

There is a type of tables known as cross-tables or data matrices.

In this table, data search occurs at the intersection of the first column values (usually representing categories) and the remaining columns (representing other data dimensions).

Cross-tables are very convenient for representing information when you need to visualize the relationship between two or more variables, which is often used in statistics and analytics.

For a specific example, imagine you have a table with sales data of products, where the first column is the product names, and the other columns are sales in different regions. Then each value within the table is the sales amount of a specific product in a specific region:

You might wonder, how to make selections in this case, and can Suvvy retrieve data from such tables on its own?

Yes! It can!

Firstly, even if we don’t describe anything forcibly - it will cope with the task using a simple prompt:

1. Ask the client about the material and in which cities they are interested in the information.
2. Refer to the table of materials by calling the list_of_materials function and 
pass the material and list of cities there.

The result we will receive is:

As seen, it understood in which columns what is located and applied the correct selections.

However, when it comes to large instructions, and if we're talking about the quality and stability of responses, additional techniques can be used.

We can add the following instruction template within the table description:

Use this template for searching data on materials and cities:
```sql
SELECT <cities> FROM data WHERE name = <material name>
Replace <cities> and <material name> with your search criteria.
```

A clearly specified template will provide better accuracy in large core instructions.

Last updated