Getting started

Let's start by creating our chains backend! Create a repository with a chains folder inside it. Our CLI will automatically deploy all chains we create in this folder.
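
For example, the repository might end up looking something like this (the file names here are illustrative, one file per chain we build in this guide):

my-analyst-agent/
└── chains/
    ├── nl-query-generator.ts
    ├── nl-to-sql.ts
    └── chart-generator.ts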

Speaking of CLI, let's install it!

npm install -g @relevanceai/chain

Next, let's authenticate:

relevance login

After following the instructions, you'll be set up and ready to go! Make sure to also add your OpenAI key.

Don’t want to read the details? There’s a demo repo and video at the bottom of this guide! You can check out the code and run it yourself.

Check out a video of the final product!

What’s the plan?

We want to create an agent that will look at a table of data and think “what interesting insight can I find here?”. It should then write SQL queries to find those insights and build a dashboard full of charts.

To do this, we want to create three key chains that interact with each other, plus a step in between to run the generated SQL against our database.

  1. We want a chain that takes in the SQL table metadata and starts thinking of interesting questions to ask the data.

  2. The next chain should take those questions and figure out what SQL queries will help us find the data to answer them.

  3. We then want to run these SQL queries on our database.

  4. The final chain should take the retrieved data, as well as the question we are trying to answer, and create a chart for our dashboard.

Let’s get started!

Asking questions of the data

Our first chain needs to:

  1. Receive metadata about the SQL table we want to analyse.
  2. Think of some interesting questions to ask, based on the metadata.

To do this, we first have to define the params our chain should receive. In this case, that’s the SQL table’s name and its columns.

export default defineChain({
    params: {
        amountToGenerate: {
            type: 'number',
        },
        tableName: {
            type: 'string',
        },
        tableColumns: {
            type: 'array',
            items: {
                type: 'object',
                properties: {
                    column: {
                        type: 'string',
                    },
                    type: {
                        type: 'string',
                    },
                },
            },
        },
    },
});

We can tell the chain how many questions to ask via the amountToGenerate param.

Next, let’s start declaring the chain steps.

We’ll start by converting our array of table columns into a string so it can be used in an LLM prompt. We can do this with the code utility, which allows us to run JavaScript as part of our chain.

export default defineChain({
    ...restOfChain,
    setup({ params, step, code }) {
        // Destructure the chain params for use in steps
        const { tableName, tableColumns, amountToGenerate } = params;

        // run a code step to reduce the table columns into a string
        const { transformed: joinedColumns } = code(
            { tableColumns },
            ({ tableColumns }) => {
                return tableColumns.reduce((acc, column) => {
                    return acc + `${column.column} (${column.type}),`;
                }, 'The SQL table columns are:');
            }
        );
    },
});

Next, let’s create an LLM prompt step that asks some questions based on the table name and the columns.

// in the setup function
const { answer: queriesString } = step('prompt_completion', {
    prompt: `You are looking at the SQL table called: ${tableName}.
${joinedColumns}.

Give me a JSON array of the top ${amountToGenerate} questions you might want to ask of this table. They should be about varying topics. They shouldn't be answerable with only one thing.`,

    system_prompt: `You are a business analyst. You must respond with good questions to ask of your SQL database. You should respond with ${amountToGenerate} options always. You should respond in the format: [question].`,
});

Note that we ask the LLM to return a string that looks like an array. To turn it into an actual array, we can use a JavaScript code step again:

const { transformed: arrayQueries } = code(
    { queriesString },
    ({ queriesString }) => {
        let queries = [];
        try {
            // convert the string into an array
            queries = JSON.parse(queriesString);
        } catch (e) {}

        return queries;
    }
);

Finally, our chain should return the arrayQueries:

return {
    queries: arrayQueries,
};

Now in our frontend code, we’re able to run this chain:

const { queries } = await client.runChain<typeof nlQueryGenerator>(
    'nl-query-generator',
    {
        // table name and columns retrieved from our SQL schema
        tableName: TABLE_NAME,
        tableColumns: TABLE_COLUMNS,
        amountToGenerate: 10,
    }
);

Figuring out how to answer the questions with SQL queries

Now that we have a list of natural language questions based on our SQL table, let’s figure out how to retrieve the data we need to answer them!

We’ll start by defining the params our chain should receive. This chain will convert one question at a time to SQL, so it should receive the natural language query as well as the table metadata.

You could make this chain receive and handle all the questions if you’d prefer! I opted for one at a time, purely for frontend effect (the ability to mock a streaming-esque interface); there’s a sketch of that per-question loop at the end of this section.

{
  naturalLanguageQuery: {
    type: 'string'
  },
  tableName: {
    type: 'string'
  },
  tableColumns: {
    type: 'array',
    items: {
      type: 'object',
      properties: {
        column: {
          type: 'string'
        },
        type: {
          type: 'string'
        }
      }
    }
  }
}

We’ll start our setup with the same reduction as before, to get our table metadata ready for LLMs.

// in the setup function
const { naturalLanguageQuery, tableColumns, tableName } = params;

const { transformed: joinedColumns } = code(
    { tableColumns },
    ({ tableColumns }) => {
        return tableColumns.reduce((acc, column) => {
            return acc + `${column.column} (${column.type}),`;
        }, 'The SQL table columns are:');
    }
);

Next, we’ll create an LLM prompt to identify which part of the natural language query is most relevant to generating an SQL query.

// Set this variable up for re-use across prompts in this chain.
const preface = `The table name is: ${tableName}. The database columns are: ${joinedColumns}. A user has made this request about a SQL query:`;

const { answer: relevantQuerySection } = step('prompt_completion', {
    prompt: `${preface} "${naturalLanguageQuery}". Identify and respond with what part of this request is relevant to generating an SQL query, suitable for a MySQL database.`,
});

This way, if the question contains a lot of extra information, we can extract only the part that’s pertinent to querying the database.

Next, let’s take this relevant section and ask the LLM to generate a valid SQL query.

const { answer: rawQuery } = step('prompt_completion', {
    prompt: `${preface} "${relevantQuerySection}". Identify and respond with the SQL query that would satisfy this request. Don't include any explanation of the query, just the query itself. Limit the page size to 10 records.`,
    system_prompt:
        'You must only return valid SQL. Never provide any explanations or other data.',
});
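
To round the chain off, we return the generated SQL so the frontend can run it. A minimal sketch, assuming we expose it under a sqlQuery key:

return {
    sqlQuery: rawQuery,
};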

Perfect!
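
Since this chain handles one question at a time, the frontend just loops over the questions produced by the first chain and runs it for each one. A rough sketch, assuming the chain file exports nlToSql and is deployed as 'nl-to-sql' (both names are illustrative):

for (const naturalLanguageQuery of queries) {
    // run the question-to-SQL chain for a single question
    const { sqlQuery } = await client.runChain<typeof nlToSql>('nl-to-sql', {
        naturalLanguageQuery,
        tableName: TABLE_NAME,
        tableColumns: TABLE_COLUMNS,
    });

    // next: run sqlQuery against the database and chart the results
}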

Generating charts

Once we have our SQL queries, we can use them to retrieve data from our database. In this guide’s accompanying repo, we use Prisma to connect to a Planetscale database.

We take the results from the database and feed them into our final chain, which generates a chart!
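
The accompanying repo handles this step outside of the chains. A rough sketch of that glue code, assuming a Prisma client and that the chart chain is deployed as 'chart-generator' (an illustrative name):

import { PrismaClient } from '@prisma/client';

const prisma = new PrismaClient();

// Run the LLM-generated SQL. $queryRawUnsafe executes a raw SQL string, so only
// use it against data you're comfortable querying this way.
const results = await prisma.$queryRawUnsafe<Record<string, unknown>[]>(sqlQuery);

// Feed the rows and the original question into the chart chain
const { chart } = await client.runChain('chart-generator', {
    results,
    naturalLanguageQuery,
});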

We’ll ask it to generate a configuration for Chart.js, so we can display the charts in the frontend.

// params
{
  results: {
    type: 'array',
    items: {
      type: 'object'
    }
  },
  naturalLanguageQuery: {
    type: 'string'
  },
}

For each set of results, we’ll pass in the results as well as the original question. This way the chart can be designed to answer the question.

In the first chain step, we’ll ask the LLM to select an appropriate type of chart for this data.

// in setup
const { results, naturalLanguageQuery } = params;

const { answer: typeOfChart } = step('prompt_completion', {
    prompt: `The user has requested this data: "${naturalLanguageQuery}". These are the types of charts you know how to create: 'bar', 'doughnut', 'line', 'bubble' and 'radar'. This is what the data looks like: ${results[0]}. 

Identify and respond with the type of chart that would be most appropriate for this request.`,
});

Then, we use a code step to prepare an example configuration for that chart type.

const { transformed: exampleCode } = code(
    { typeOfChart },
    ({ typeOfChart }) => {
        const examples: Record<string, any> = {
            line: `{
  type: 'line',
  data: {
  labels: labels,
  datasets: [{
    label: 'My First Dataset',
    data: [65, 59, 80, 81, 56, 55, 40],
    fill: false,
    borderColor: 'rgb(75, 192, 192)',
    tension: 0.1
  }]
},
  options: {
    indexAxis: 'y',
    scales: {
      x: {
        beginAtZero: true
      }
    }
  }
}`,
            bar: `{
  type: 'bar',
  data: {
  labels: labels,
  datasets: [{
    label: 'My First Dataset',
    data: [65, 59, 80, 81, 56, 55, 40],
    backgroundColor: [
      'rgba(255, 99, 132, 0.2)',
      'rgba(255, 159, 64, 0.2)',
      'rgba(255, 205, 86, 0.2)',
      'rgba(75, 192, 192, 0.2)',
      'rgba(54, 162, 235, 0.2)',
      'rgba(153, 102, 255, 0.2)',
      'rgba(201, 203, 207, 0.2)'
    ],
    borderColor: [
      'rgb(255, 99, 132)',
      'rgb(255, 159, 64)',
      'rgb(255, 205, 86)',
      'rgb(75, 192, 192)',
      'rgb(54, 162, 235)',
      'rgb(153, 102, 255)',
      'rgb(201, 203, 207)'
    ],
    borderWidth: 1
  }]
},
  options: {
    indexAxis: 'y',
  }
}`,
            doughnut: `{
  type: 'doughnut',
  data: {
  labels: [
    'Red',
    'Blue',
    'Yellow'
  ],
  datasets: [{
    label: 'My First Dataset',
    data: [300, 50, 100],
    backgroundColor: [
      'rgb(255, 99, 132)',
      'rgb(54, 162, 235)',
      'rgb(255, 205, 86)'
    ],
    hoverOffset: 4
  }]
},
}`,
            bubble: `{
  type: 'bubble',
  data: {
  datasets: [{
    label: 'First Dataset',
    data: [{
      x: 20,
      y: 30,
      r: 15
    }, {
      x: 40,
      y: 10,
      r: 10
    }],
    backgroundColor: 'rgb(255, 99, 132)'
  }]
},
}`,
            radar: `{
  type: 'radar',
  data: {
  labels: [
    'Eating',
    'Drinking',
    'Sleeping',
    'Designing',
    'Coding',
    'Cycling',
    'Running'
  ],
  datasets: [{
    label: 'My First Dataset',
    data: [65, 59, 90, 81, 56, 55, 40],
    fill: true,
    backgroundColor: 'rgba(255, 99, 132, 0.2)',
    borderColor: 'rgb(255, 99, 132)',
    pointBackgroundColor: 'rgb(255, 99, 132)',
    pointBorderColor: '#fff',
    pointHoverBackgroundColor: '#fff',
    pointHoverBorderColor: 'rgb(255, 99, 132)'
  }, {
    label: 'My Second Dataset',
    data: [28, 48, 40, 19, 96, 27, 100],
    fill: true,
    backgroundColor: 'rgba(54, 162, 235, 0.2)',
    borderColor: 'rgb(54, 162, 235)',
    pointBackgroundColor: 'rgb(54, 162, 235)',
    pointBorderColor: '#fff',
    pointHoverBackgroundColor: '#fff',
    pointHoverBorderColor: 'rgb(54, 162, 235)'
  }]
},
  options: {
    elements: {
      line: {
        borderWidth: 3
      }
    }
  },
}`,
        };

        return examples[typeOfChart] ?? examples.bar;
    }
);

Finally, we ask the LLM to create a chart configuration! Note that we also ask it to return a custom key called "original_query", so we can display the original question in the frontend.

const { answer: chartConfiguration } = step('prompt_completion', {
    prompt: `Here is an example ChartJS config:

${exampleCode}
      
Here is some data: ${results}.
      
Create a chartjs.org chart of type ${typeOfChart} that would be appropriate for this data. Create a valid ChartJS config for this chart. Only return the valid JSON config. Don't use more than 5 colours. Give the chart a relevant title, based on this question: ${naturalLanguageQuery}. Include a key in the config called "original_query" that contains: ${naturalLanguageQuery}.`,
    system_prompt:
        'You are a ChartJS config creator. You must only return valid JSON. Never provide any explanations or other data.',
});
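
As before, the LLM gives us the config back as a string, so we can parse it with one more code step and return it from the chain. A minimal sketch, assuming we return it under a chart key:

const { transformed: chart } = code(
    { chartConfiguration },
    ({ chartConfiguration }) => {
        let config = {};
        try {
            // convert the JSON string into a config object
            config = JSON.parse(chartConfiguration);
        } catch (e) {}

        return config;
    }
);

return {
    chart,
};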

Piecing it all together!

I’ve created a repo that pieces these chains together to create a really cool business analyst agent!

You’ll notice I even add the ability for the user to ask questions at the end! There are some edge cases you can find that break the agent, but it’s a great start! With more prompt engineering, particularly if you know the sort of data that will be processed, you can make this agent even more robust.
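
If you're building your own frontend around these chains, rendering the returned config only takes a couple of lines with Chart.js. A minimal sketch, assuming a <canvas id="chart"> element and the chart key from the final chain:

import Chart from 'chart.js/auto';

// pull our custom original_query key out so only standard ChartJS config remains;
// original_query can then be shown as a heading above the chart
const { original_query, ...config } = chart;

const canvas = document.getElementById('chart') as HTMLCanvasElement;
new Chart(canvas, config as any);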

Check it out on GitHub! It uses Planetscale for the database.

Check out this video of it in action: