# Spreadsheet Helper

<figure><img src="/files/6X5kYUXW5wUsKYXaSMfU" alt="" width="183"><figcaption></figcaption></figure>

## Overview

SpreadSheet Helper is a tool designed for performing operations similar to spreadsheets and SQL like operations, such as data grouping and joining. You need to specify the data you want to work with by referencing it.

## Infer data types

To keep data type consistent, you can choose whether to enable data type inference or leave it disabled (it's turned off by default).

![Infer data types option in all relevant actions](/files/u16oE2xOYcEuDX7aEtpr)

If this option is turned on, the Helper will try to automatically infer the data type of each column. This might cause unexpected results when working with zip codes and other numbers that can start with 0.

Otherwise, the data types will be left as provided by the source.

## Actions

### 1. Group columns by

Group a spreadsheet or CSV file and apply different aggregates and properties.

The available aggregation functions are:

* mean
* sum
* size
* count
* std
* var
* sem
* first
* last
* min
* max
* median

For example, let's find the lowest price among devices. To do this we need to group columns by `Product`\
Result columns must use the following syntax:\
`<column name or index>;<aggregation function><new column name (optional)>`\
In the example \
Price - column name\
min - aggregation function\
lowest\_price - new column name (optional)

<figure><img src="/files/ZRclcKta1VUJpaZdRlfR" alt=""><figcaption></figcaption></figure>

### 2. Insert column

Insert a column into a spreadsheet[ - Pandas Docs](https://pandas.pydata.org/pandas-docs/stable/reference/api/pandas.DataFrame.insert.html) for further details

### 3. Join

Join multiple spreadsheets or tables with a common column (VLookup)[ - Pandas Docs](https://pandas.pydata.org/pandas-docs/stable/reference/api/pandas.DataFrame.merge.html) for further details.

This comes in handy if you want to map different tables based on common column data.

### 4. Query spreadsheet

The database is purely in memory and **SQLite** is used. See SQLite functions [here](https://sqlite.org/lang_corefunc.html).

Special cases not covered by SQLite:

* [Full outer join](https://www.sqlitetutorial.net/sqlite-full-outer-join/) - can be achieved like here

### 5. Remove duplicates

Remove duplicates from a spreadsheet[ - Pandas Docs](https://pandas.pydata.org/pandas-docs/stable/reference/api/pandas.DataFrame.drop_duplicates.html) for further details

### 6. Remove specific labels

Remove a specified row or column[ - Pandas Docs](https://pandas.pydata.org/pandas-docs/stable/reference/api/pandas.DataFrame.drop.html) for further details

### 7. Return first n rows

Return the first n rows of a spreadsheet[ - Pandas Docs](https://pandas.pydata.org/pandas-docs/stable/reference/api/pandas.DataFrame.head.html) for further details

### 8. Sort by

Sort the spreadsheet by one or more columns - [Pandas Docs](https://pandas.pydata.org/pandas-docs/stable/reference/api/pandas.DataFrame.sort_values.html) for further details


---

# Agent Instructions: Querying This Documentation

If you need additional information that is not directly available in this page, you can query the documentation dynamically by asking a question.

Perform an HTTP GET request on the current page URL with the `ask` query parameter:

```
GET https://docs.locoia.com/connectors/helpers/spreadsheet-helper.md?ask=<question>
```

The question should be specific, self-contained, and written in natural language.
The response will contain a direct answer to the question and relevant excerpts and sources from the documentation.

Use this mechanism when the answer is not explicitly present in the current page, you need clarification or additional context, or you want to retrieve related documentation sections.
