Carbon
Connect external data to LLMs, no matter the source.
Table of Contents
- Installation
- Getting Started
- Reference
carbon.auth.getAccessToken
carbon.auth.getWhiteLabeling
carbon.dataSources.queryUserDataSources
carbon.dataSources.revokeAccessToken
carbon.embeddings.getDocuments
carbon.embeddings.getEmbeddingsAndChunks
carbon.embeddings.uploadChunksAndEmbeddings
carbon.files.createUserFileTags
carbon.files.delete
carbon.files.deleteFileTags
carbon.files.deleteMany
carbon.files.deleteV2
carbon.files.getParsedFile
carbon.files.getRawFile
carbon.files.queryUserFiles
carbon.files.queryUserFilesDeprecated
carbon.files.resync
carbon.files.upload
carbon.files.uploadFromUrl
carbon.files.uploadText
carbon.health.check
carbon.integrations.connectDataSource
carbon.integrations.connectFreshdesk
carbon.integrations.connectGitbook
carbon.integrations.createAwsIamUser
carbon.integrations.getOauthUrl
carbon.integrations.listConfluencePages
carbon.integrations.listDataSourceItems
carbon.integrations.listFolders
carbon.integrations.listGitbookSpaces
carbon.integrations.listLabels
carbon.integrations.listOutlookCategories
carbon.integrations.listRepos
carbon.integrations.syncConfluence
carbon.integrations.syncDataSourceItems
carbon.integrations.syncFiles
carbon.integrations.syncGitHub
carbon.integrations.syncGitbook
carbon.integrations.syncGmail
carbon.integrations.syncOutlook
carbon.integrations.syncRepos
carbon.integrations.syncRssFeed
carbon.integrations.syncS3Files
carbon.organizations.get
carbon.organizations.update
carbon.users.delete
carbon.users.get
carbon.users.toggleUserFeatures
carbon.users.updateUsers
carbon.utilities.fetchUrls
carbon.utilities.fetchYoutubeTranscripts
carbon.utilities.processSitemap
carbon.utilities.scrapeSitemap
carbon.utilities.scrapeWeb
carbon.utilities.searchUrls
carbon.webhooks.addUrl
carbon.webhooks.deleteUrl
carbon.webhooks.urls
Installation
npm i carbon-typescript-sdk
pnpm i carbon-typescript-sdk
yarn add carbon-typescript-sdk
Getting Started
import { Carbon } from "carbon-typescript-sdk";
// Generally this is done in the backend to avoid exposing API key to the client
const carbonWithApiKey = new Carbon({
apiKey: "API_KEY",
customerId: "CUSTOMER_ID",
});
const accessToken = await carbonWithApiKey.auth.getAccessToken();
// Once an access token is obtained, it can be passed to the frontend
// and used to instantiate the SDK client without an API key
const carbon = new Carbon({
accessToken: accessToken.data.access_token,
});
// use SDK as usual
const whiteLabeling = await carbon.auth.getWhiteLabeling();
// etc.
Reference
carbon.auth.getAccessToken
Get Access Token
🛠️ Usage
const getAccessTokenResponse = await carbon.auth.getAccessToken();
🔄 Return
🌐 Endpoint
/auth/v1/access_token
GET
carbon.auth.getWhiteLabeling
Returns whether or not the organization is white labeled and which integrations are white labeled
🛠️ Usage
const getWhiteLabelingResponse = await carbon.auth.getWhiteLabeling();
🔄 Return
🌐 Endpoint
/auth/v1/white_labeling
GET
carbon.dataSources.queryUserDataSources
User Data Sources
🛠️ Usage
const queryUserDataSourcesResponse =
await carbon.dataSources.queryUserDataSources({
order_by: "created_at",
order_dir: "desc",
});
⚙️ Parameters
pagination: Pagination
order_by: OrganizationUserDataSourceOrderByColumns
order_dir: OrderDir
filters: OrganizationUserDataSourceFilters
🔄 Return
OrganizationUserDataSourceResponse
🌐 Endpoint
/user_data_sources
POST
carbon.dataSources.revokeAccessToken
Revoke Access Token
🛠️ Usage
const revokeAccessTokenResponse = await carbon.dataSources.revokeAccessToken({
data_source_id: 1,
});
⚙️ Parameters
data_source_id: number
🔄 Return
🌐 Endpoint
/revoke_access_token
POST
carbon.embeddings.getDocuments
For pre-filtering documents, using tags_v2
is preferred to using tags
(which is now deprecated). If both tags_v2
and tags
are specified, tags
is ignored. tags_v2
enables
building complex filters through the use of "AND", "OR", and negation logic. Take the below input as an example:
{
"OR": [
{
"key": "subject",
"value": "holy-bible",
"negate": false
},
{
"key": "person-of-interest",
"value": "jesus christ",
"negate": false
},
{
"key": "genre",
"value": "religion",
"negate": true
},
{
"AND": [
{
"key": "subject",
"value": "tao-te-ching",
"negate": false
},
{
"key": "author",
"value": "lao-tzu",
"negate": false
}
]
}
]
}
In this case, files will be filtered such that:
- "subject" = "holy-bible" OR
- "person-of-interest" = "jesus christ" OR
- "genre" != "religion" OR
- "subject" = "tao-te-ching" AND "author" = "lao-tzu"
Note that the top level of the query must be either an "OR" or "AND" array. Currently, nesting is limited to 3. For tag blocks (those with "key", "value", and "negate" keys), the following typing rules apply:
- "key" isn't optional and must be a `string`
- "value" isn't optional and can be `any` or `list[any]`
- "negate" is optional and must be `true` or `false`. If present and `true`, then the filter block is negated in the resulting query. It is `false` by default.
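The example filter above can be expressed directly in TypeScript before being passed as the `tags_v2` field of a request. The `TagBlock` and `TagFilter` type names below are illustrative sketches, not the SDK's exported types:

```typescript
// Minimal shapes for tag blocks and the nested AND/OR filter described above.
// These type names are illustrative, not SDK exports.
type TagBlock = { key: string; value: string | string[]; negate?: boolean };
type TagFilter = { OR?: (TagBlock | TagFilter)[]; AND?: (TagBlock | TagFilter)[] };

const tagsV2: TagFilter = {
  OR: [
    { key: "subject", value: "holy-bible", negate: false },
    { key: "person-of-interest", value: "jesus christ", negate: false },
    { key: "genre", value: "religion", negate: true },
    {
      AND: [
        { key: "subject", value: "tao-te-ching", negate: false },
        { key: "author", value: "lao-tzu", negate: false },
      ],
    },
  ],
};

// The filter would then be sent in the request body, e.g.:
// await carbon.embeddings.getDocuments({ query: "...", k: 5, tags_v2: tagsV2 });
```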
When querying embeddings, you can optionally specify the `media_type` parameter in your request. By default (if not set), it is equal to "TEXT". This means that the query will be performed over files that have been parsed as text (for now, this covers all files except image files). If it is equal to "IMAGE", the query will be performed over image files (for now, `.jpg` and `.png` files). You can think of this field as an additional filter on top of any filters set in `file_ids` and `parent_file_ids`.
When `hybrid_search` is set to true, a combination of keyword search and semantic search is used to rank and select candidate embeddings during information retrieval. By default, these search methods are weighted equally during the ranking process. To adjust the weight (or "importance") of each search method, you can use the `hybrid_search_tuning_parameters` property. The descriptions for the different tuning parameters are:
- `weight_a`: weight to assign to semantic search
- `weight_b`: weight to assign to keyword search
You must ensure that `sum(weight_a, weight_b, ..., weight_n)` for all n weights is equal to 1. The equality has an error tolerance of 0.001 to account for possible floating point issues.
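The weight constraint above can be checked client-side before sending a request. A minimal sketch (the helper name is ours, not part of the SDK):

```typescript
// Validate that hybrid search weights sum to 1 within the 0.001 tolerance
// described above. Illustrative helper, not part of the SDK.
function validateHybridWeights(weights: number[], tolerance = 0.001): boolean {
  const sum = weights.reduce((acc, w) => acc + w, 0);
  return Math.abs(sum - 1) <= tolerance;
}

const tuning = { weight_a: 0.7, weight_b: 0.3 };
if (!validateHybridWeights([tuning.weight_a, tuning.weight_b])) {
  throw new Error("hybrid search weights must sum to 1");
}
// `tuning` would then be sent as `hybrid_search_tuning_parameters`
// alongside `hybrid_search: true` in a getDocuments request.
```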
In order to use hybrid search for a customer across a set of documents, two flags need to be enabled:
- Use the `/modify_user_configuration` endpoint to enable `sparse_vectors` for the customer. The payload body for this request is below:
{
"configuration_key_name": "sparse_vectors",
"value": {
"enabled": true
}
}
- Make sure hybrid search is enabled for the documents across which you want to perform the search. For the `/uploadfile` endpoint, this can be done by setting the following query parameter: `generate_sparse_vectors=true`
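Together, the two steps above might look like this in TypeScript. This is a sketch; the payload mirrors the JSON shape shown above, and the upload flag is the camelCase SDK equivalent of the query parameter:

```typescript
// Step 1: payload for /modify_user_configuration, enabling sparse vectors
// for the customer (shape mirrors the JSON above).
const sparseVectorsConfig = {
  configuration_key_name: "sparse_vectors",
  value: { enabled: true },
};

// Step 2: when uploading, request sparse vector generation so the file
// is a candidate for hybrid search.
const uploadOptions = { generateSparseVectors: true };

// e.g. await carbon.files.upload({ ...uploadOptions, file });
```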
Carbon supports multiple models for use in generating embeddings for files. For images, we support Vertex AI's multimodal model; for text, we support OpenAI's `text-embedding-ada-002` and Cohere's `embed-multilingual-v3.0`. The model can be specified via the `embedding_model` parameter (in the POST body for `/embeddings`, and a query parameter in `/uploadfile`). If no model is supplied, `text-embedding-ada-002` is used by default. When performing embedding queries, only embeddings from files that used the specified model will be considered in the query. For example, if files A and B have embeddings generated with `OPENAI`, and files C and D have embeddings generated with `COHERE_MULTILINGUAL_V3`, then by default, queries will only consider files A and B. If `COHERE_MULTILINGUAL_V3` is specified as the `embedding_model` in `/embeddings`, then only files C and D will be considered. Make sure that the set of all files you want considered for a query have embeddings generated via the same model. For now, do not set `VERTEX_MULTIMODAL` as an `embedding_model`. This model is used automatically by Carbon when it detects an image file.
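The same-model rule above can be illustrated with a small sketch. The `FileRecord` shape and helper below are hypothetical, purely to show which files a query would consider:

```typescript
// Illustrative sketch: only files whose embeddings were generated with the
// query's embedding_model are considered. Hypothetical shapes, not SDK types.
type FileRecord = { id: number; embeddingModel: "OPENAI" | "COHERE_MULTILINGUAL_V3" };

function filesConsideredFor(
  model: FileRecord["embeddingModel"],
  files: FileRecord[]
): number[] {
  return files.filter((f) => f.embeddingModel === model).map((f) => f.id);
}

const files: FileRecord[] = [
  { id: 1, embeddingModel: "OPENAI" },                 // file A
  { id: 2, embeddingModel: "OPENAI" },                 // file B
  { id: 3, embeddingModel: "COHERE_MULTILINGUAL_V3" }, // file C
  { id: 4, embeddingModel: "COHERE_MULTILINGUAL_V3" }, // file D
];

// Default (OPENAI) queries consider A and B; COHERE_MULTILINGUAL_V3 considers C and D.
```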
🛠️ Usage
const getDocumentsResponse = await carbon.embeddings.getDocuments({
query: "query_example",
k: 1,
include_all_children: false,
media_type: "TEXT",
embedding_model: "OPENAI",
});
⚙️ Parameters
query: string
Query for which to get related chunks and embeddings.
k: number
Number of related chunks to return.
tags: Record<string, Tags1>
A set of tags to limit the search to. Deprecated and may be removed in the future.
query_vector: number[]
Optional query vector for which to get related chunks and embeddings. It must have been generated by the same model used to generate the embeddings across which the search is being conducted. Cannot provide both `query` and `query_vector`.
file_ids: number[]
Optional list of file IDs to limit the search to
parent_file_ids: number[]
Optional list of parent file IDs to limit the search to. A parent file describes a file to which another file belongs (e.g. a folder)
include_all_children: boolean
Flag to control whether or not to include all children of filtered files in the embedding search.
tags_v2: object
A set of tags to limit the search to. Use this instead of `tags`, which is deprecated.
include_tags: boolean
Flag to control whether or not to include tags for each chunk in the response.
include_vectors: boolean
Flag to control whether or not to include embedding vectors in the response.
include_raw_file: boolean
Flag to control whether or not to include a signed URL to the raw file containing each chunk in the response.
hybrid_search: boolean
Flag to control whether or not to perform hybrid search.
hybrid_search_tuning_parameters: HybridSearchTuningParamsNullable
media_type: FileContentTypesNullable
Used to filter the kind of files (e.g. `TEXT` or `IMAGE`) over which to perform the search. Also plays a role in determining what embedding model is used to embed the query. If `IMAGE` is chosen as the media type, then the embedding model used will be an embedding model that is not text-only, regardless of what value is passed for `embedding_model`.
embedding_model: EmbeddingGeneratorsNullable
🔄 Return
🌐 Endpoint
/embeddings
POST
carbon.embeddings.getEmbeddingsAndChunks
Retrieve Embeddings And Content
🛠️ Usage
const getEmbeddingsAndChunksResponse =
await carbon.embeddings.getEmbeddingsAndChunks({
order_by: "created_at",
order_dir: "desc",
filters: {
user_file_id: 1,
embedding_model: "OPENAI",
},
include_vectors: false,
});
⚙️ Parameters
filters: EmbeddingsAndChunksFilters
pagination: Pagination
order_by: EmbeddingsAndChunksOrderByColumns
order_dir: OrderDir
include_vectors: boolean
🔄 Return
🌐 Endpoint
/text_chunks
POST
carbon.embeddings.uploadChunksAndEmbeddings
Upload Chunks And Embeddings
🛠️ Usage
const uploadChunksAndEmbeddingsResponse =
await carbon.embeddings.uploadChunksAndEmbeddings({
embedding_model: "OPENAI",
chunks_and_embeddings: [
{
file_id: 1,
chunks_and_embeddings: [
{
chunk_number: 1,
chunk: "chunk_example",
},
],
},
],
overwrite_existing: false,
chunks_only: false,
});
⚙️ Parameters
embedding_model: EmbeddingGenerators
chunks_and_embeddings: SingleChunksAndEmbeddingsUploadInput[]
overwrite_existing: boolean
chunks_only: boolean
custom_credentials: { [key: string]: object; }
🔄 Return
🌐 Endpoint
/upload_chunks_and_embeddings
POST
carbon.files.createUserFileTags
A tag is a key-value pair that can be added to a file. This pair can then be used for searches (e.g. embedding searches) in order to narrow down the scope of the search. A file can have any number of tags. The following are reserved keys that cannot be used:
- db_embedding_id
- organization_id
- user_id
- organization_user_file_id
Carbon currently supports two data types for tag values - `string` and `list<string>`. Keys can only be `string`. If values other than `string` and `list<string>` are used, they're automatically converted to strings (e.g. 4 will become "4").
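The reserved-key and type-coercion rules above can be sketched client-side. The helpers below are ours, not part of the SDK; they simply mirror what the rules describe:

```typescript
// Illustrative helpers mirroring the tag rules above: reject reserved keys
// and coerce non-string values to strings. Not part of the SDK.
const RESERVED_TAG_KEYS = [
  "db_embedding_id",
  "organization_id",
  "user_id",
  "organization_user_file_id",
];

function assertTagKeyAllowed(key: string): void {
  if (RESERVED_TAG_KEYS.includes(key)) {
    throw new Error(`"${key}" is a reserved tag key`);
  }
}

function normalizeTagValue(value: unknown): string | string[] {
  if (typeof value === "string") return value;
  if (Array.isArray(value)) return value.map((v) => String(v));
  return String(value); // e.g. 4 becomes "4"
}
```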
🛠️ Usage
const createUserFileTagsResponse = await carbon.files.createUserFileTags({
tags: {
key: "string_example",
},
organization_user_file_id: 1,
});
⚙️ Parameters
tags: Record<string, Tags1
>
organization_user_file_id: number
🔄 Return
🌐 Endpoint
/create_user_file_tags
POST
carbon.files.delete
Delete File Endpoint
🛠️ Usage
const deleteResponse = await carbon.files.delete({
fileId: 1,
});
⚙️ Parameters
fileId: number
🔄 Return
🌐 Endpoint
/deletefile/{file_id}
DELETE
carbon.files.deleteFileTags
Delete File Tags
🛠️ Usage
const deleteFileTagsResponse = await carbon.files.deleteFileTags({
tags: ["tags_example"],
organization_user_file_id: 1,
});
⚙️ Parameters
tags: string[]
organization_user_file_id: number
🔄 Return
🌐 Endpoint
/delete_user_file_tags
POST
carbon.files.deleteMany
Delete Files Endpoint
🛠️ Usage
const deleteManyResponse = await carbon.files.deleteMany({
delete_non_synced_only: false,
send_webhook: false,
delete_child_files: false,
});
⚙️ Parameters
file_ids: number[]
sync_statuses: ExternalFileSyncStatuses[]
delete_non_synced_only: boolean
send_webhook: boolean
delete_child_files: boolean
🔄 Return
🌐 Endpoint
/delete_files
POST
carbon.files.deleteV2
Delete Files V2 Endpoint
🛠️ Usage
const deleteV2Response = await carbon.files.deleteV2({
send_webhook: false,
});
⚙️ Parameters
filters: OrganizationUserFilesToSyncFilters
send_webhook: boolean
🔄 Return
🌐 Endpoint
/delete_files_v2
POST
carbon.files.getParsedFile
This route is deprecated. Use `/user_files_v2` instead.
🛠️ Usage
const getParsedFileResponse = await carbon.files.getParsedFile({
fileId: 1,
});
⚙️ Parameters
fileId: number
🔄 Return
🌐 Endpoint
/parsed_file/{file_id}
GET
carbon.files.getRawFile
This route is deprecated. Use `/user_files_v2` instead.
🛠️ Usage
const getRawFileResponse = await carbon.files.getRawFile({
fileId: 1,
});
⚙️ Parameters
fileId: number
🔄 Return
🌐 Endpoint
/raw_file/{file_id}
GET
carbon.files.queryUserFiles
For pre-filtering documents, using `tags_v2` is preferred to using `tags` (which is now deprecated). If both `tags_v2` and `tags` are specified, `tags` is ignored. `tags_v2` enables building complex filters through the use of "AND", "OR", and negation logic. Take the below input as an example:
{
"OR": [
{
"key": "subject",
"value": "holy-bible",
"negate": false
},
{
"key": "person-of-interest",
"value": "jesus christ",
"negate": false
},
{
"key": "genre",
"value": "religion",
"negate": true
},
{
"AND": [
{
"key": "subject",
"value": "tao-te-ching",
"negate": false
},
{
"key": "author",
"value": "lao-tzu",
"negate": false
}
]
}
]
}
In this case, files will be filtered such that:
- "subject" = "holy-bible" OR
- "person-of-interest" = "jesus christ" OR
- "genre" != "religion" OR
- "subject" = "tao-te-ching" AND "author" = "lao-tzu"
Note that the top level of the query must be either an "OR" or "AND" array. Currently, nesting is limited to 3. For tag blocks (those with "key", "value", and "negate" keys), the following typing rules apply:
- "key" isn't optional and must be a `string`
- "value" isn't optional and can be `any` or `list[any]`
- "negate" is optional and must be `true` or `false`. If present and `true`, then the filter block is negated in the resulting query. It is `false` by default.
🛠️ Usage
const queryUserFilesResponse = await carbon.files.queryUserFiles({
order_by: "created_at",
order_dir: "desc",
});
⚙️ Parameters
pagination: Pagination
order_by: OrganizationUserFilesToSyncOrderByTypes
order_dir: OrderDir
filters: OrganizationUserFilesToSyncFilters
include_raw_file: boolean
include_parsed_text_file: boolean
include_additional_files: boolean
🔄 Return
🌐 Endpoint
/user_files_v2
POST
carbon.files.queryUserFilesDeprecated
This route is deprecated. Use `/user_files_v2` instead.
🛠️ Usage
const queryUserFilesDeprecatedResponse =
await carbon.files.queryUserFilesDeprecated({
order_by: "created_at",
order_dir: "desc",
});
⚙️ Parameters
pagination: Pagination
order_by: OrganizationUserFilesToSyncOrderByTypes
order_dir: OrderDir
filters: OrganizationUserFilesToSyncFilters
include_raw_file: boolean
include_parsed_text_file: boolean
include_additional_files: boolean
🔄 Return
🌐 Endpoint
/user_files
POST
carbon.files.resync
Resync File
🛠️ Usage
const resyncResponse = await carbon.files.resync({
file_id: 1,
force_embedding_generation: false,
});
⚙️ Parameters
file_id: number
chunk_size: number
chunk_overlap: number
force_embedding_generation: boolean
🔄 Return
🌐 Endpoint
/resync_file
POST
carbon.files.upload
This endpoint is used to directly upload local files to Carbon. The `POST` request should be a multipart form request. Note that the `set_page_as_boundary` query parameter is applicable only to PDFs for now. When this value is set, PDF chunks are at most one page long. Additional information can be retrieved for each chunk, however, namely the coordinates of the bounding box around the chunk (this can be used for things like text highlighting). Following is a description of all possible query parameters:
- `chunk_size`: the chunk size (in tokens) applied when splitting the document
- `chunk_overlap`: the chunk overlap (in tokens) applied when splitting the document
- `skip_embedding_generation`: whether or not to skip the generation of chunks and embeddings
- `set_page_as_boundary`: described above
- `embedding_model`: the model used to generate embeddings for the document chunks
- `use_ocr`: whether or not to use OCR as a preprocessing step prior to generating chunks (only valid for PDFs currently)
- `generate_sparse_vectors`: whether or not to generate sparse vectors for the file. Required for hybrid search.
- `prepend_filename_to_chunks`: whether or not to prepend the filename to the chunk text
Carbon supports multiple models for use in generating embeddings for files. For images, we support Vertex AI's multimodal model; for text, we support OpenAI's `text-embedding-ada-002` and Cohere's `embed-multilingual-v3.0`. The model can be specified via the `embedding_model` parameter (in the POST body for `/embeddings`, and a query parameter in `/uploadfile`). If no model is supplied, `text-embedding-ada-002` is used by default. When performing embedding queries, only embeddings from files that used the specified model will be considered in the query. For example, if files A and B have embeddings generated with `OPENAI`, and files C and D have embeddings generated with `COHERE_MULTILINGUAL_V3`, then by default, queries will only consider files A and B. If `COHERE_MULTILINGUAL_V3` is specified as the `embedding_model` in `/embeddings`, then only files C and D will be considered. Make sure that the set of all files you want considered for a query have embeddings generated via the same model. For now, do not set `VERTEX_MULTIMODAL` as an `embedding_model`. This model is used automatically by Carbon when it detects an image file.
🛠️ Usage
const uploadResponse = await carbon.files.upload({
skipEmbeddingGeneration: false,
setPageAsBoundary: false,
embeddingModel: "OPENAI",
useOcr: false,
generateSparseVectors: false,
prependFilenameToChunks: false,
parsePdfTablesWithOcr: false,
detectAudioLanguage: false,
file: fs.readFileSync("/path/to/file"),
});
⚙️ Parameters
file: Uint8Array | File | buffer.File
chunkSize: number
Chunk size in tiktoken tokens to be used when processing file.
chunkOverlap: number
Chunk overlap in tiktoken tokens to be used when processing file.
skipEmbeddingGeneration: boolean
Flag to control whether or not embeddings should be generated and stored when processing file.
setPageAsBoundary: boolean
Flag to control whether or not to set a page's worth of content as the maximum amount of content that can appear in a chunk. Only valid for PDFs. See the route description for more information.
embeddingModel: TextEmbeddingGenerators
Embedding model that will be used to embed file chunks.
useOcr: boolean
Whether or not to use OCR when processing files. Only valid for PDFs. Useful for documents with tables, images, and/or scanned text.
generateSparseVectors: boolean
Whether or not to generate sparse vectors for the file. This is required for the file to be a candidate for hybrid search.
prependFilenameToChunks: boolean
Whether or not to prepend the file's name to chunks.
maxItemsPerChunk: number
Number of objects per chunk. For csv, tsv, xlsx, and json files only.
parsePdfTablesWithOcr: boolean
Whether to use rich table parsing when `use_ocr` is enabled.
detectAudioLanguage: boolean
Whether to automatically detect the language of the uploaded audio file.
🔄 Return
🌐 Endpoint
/uploadfile
POST
carbon.files.uploadFromUrl
Create Upload File From Url
🛠️ Usage
const uploadFromUrlResponse = await carbon.files.uploadFromUrl({
url: "url_example",
skip_embedding_generation: false,
set_page_as_boundary: false,
embedding_model: "OPENAI",
generate_sparse_vectors: false,
use_textract: false,
prepend_filename_to_chunks: false,
parse_pdf_tables_with_ocr: false,
detect_audio_language: false,
});
⚙️ Parameters
url: string
file_name: string
chunk_size: number
chunk_overlap: number
skip_embedding_generation: boolean
set_page_as_boundary: boolean
embedding_model: EmbeddingGenerators
generate_sparse_vectors: boolean
use_textract: boolean
prepend_filename_to_chunks: boolean
max_items_per_chunk: number
Number of objects per chunk. For csv, tsv, xlsx, and json files only.
parse_pdf_tables_with_ocr: boolean
detect_audio_language: boolean
🔄 Return
🌐 Endpoint
/upload_file_from_url
POST
carbon.files.uploadText
Carbon supports multiple models for use in generating embeddings for files. For images, we support Vertex AI's multimodal model; for text, we support OpenAI's `text-embedding-ada-002` and Cohere's `embed-multilingual-v3.0`. The model can be specified via the `embedding_model` parameter (in the POST body for `/embeddings`, and a query parameter in `/uploadfile`). If no model is supplied, `text-embedding-ada-002` is used by default. When performing embedding queries, only embeddings from files that used the specified model will be considered in the query. For example, if files A and B have embeddings generated with `OPENAI`, and files C and D have embeddings generated with `COHERE_MULTILINGUAL_V3`, then by default, queries will only consider files A and B. If `COHERE_MULTILINGUAL_V3` is specified as the `embedding_model` in `/embeddings`, then only files C and D will be considered. Make sure that the set of all files you want considered for a query have embeddings generated via the same model. For now, do not set `VERTEX_MULTIMODAL` as an `embedding_model`. This model is used automatically by Carbon when it detects an image file.
🛠️ Usage
const uploadTextResponse = await carbon.files.uploadText({
contents: "contents_example",
skip_embedding_generation: false,
embedding_model: "OPENAI",
generate_sparse_vectors: false,
});
⚙️ Parameters
contents: string
name: string
chunk_size: number
chunk_overlap: number
skip_embedding_generation: boolean
overwrite_file_id: number
embedding_model: EmbeddingGeneratorsNullable
generate_sparse_vectors: boolean
🔄 Return
🌐 Endpoint
/upload_text
POST
carbon.health.check
Health
🛠️ Usage
const checkResponse = await carbon.health.check();
🌐 Endpoint
/health
GET
carbon.integrations.connectDataSource
Connect Data Source
🛠️ Usage
const connectDataSourceResponse = await carbon.integrations.connectDataSource({
authentication: {
source: "GOOGLE_DRIVE",
access_token: "access_token_example",
},
});
⚙️ Parameters
authentication: AuthenticationProperty
sync_options: SyncOptions
🔄 Return
🌐 Endpoint
/integrations/connect
POST
carbon.integrations.connectFreshdesk
Refer to this article to obtain an API key: https://support.freshdesk.com/en/support/solutions/articles/215517. Make sure that your API key has permission to read solutions from your account and that you are on a paid plan. Once you have an API key, you can make a request to this endpoint along with your Freshdesk domain. This will trigger an automatic sync of the articles in your "solutions" tab. The additional parameters below can be used to associate data with the synced articles or modify the sync behavior.
🛠️ Usage
const connectFreshdeskResponse = await carbon.integrations.connectFreshdesk({
domain: "domain_example",
api_key: "api_key_example",
chunk_size: 1500,
chunk_overlap: 20,
skip_embedding_generation: false,
embedding_model: "OPENAI",
generate_sparse_vectors: false,
prepend_filename_to_chunks: false,
sync_files_on_connection: true,
sync_source_items: true,
});
⚙️ Parameters
domain: string
api_key: string
tags: object
chunk_size: number
chunk_overlap: number
skip_embedding_generation: boolean
embedding_model: EmbeddingGeneratorsNullable
generate_sparse_vectors: boolean
prepend_filename_to_chunks: boolean
sync_files_on_connection: boolean
request_id: string
sync_source_items: boolean
Enabling this flag will fetch all available content from the source to be listed via the list items endpoint
file_sync_config: HelpdeskFileSyncConfigNullable
🔄 Return
🌐 Endpoint
/integrations/freshdesk
POST
carbon.integrations.connectGitbook
You will need an access token to connect your Gitbook account. Note that the permissions will be defined by the user generating the access token, so make sure you have permission to access the spaces you will be syncing. Refer to this article for more details: https://developer.gitbook.com/gitbook-api/authentication. Additionally, you need to specify the name of the organization you will be syncing data from.
🛠️ Usage
const connectGitbookResponse = await carbon.integrations.connectGitbook({
organization: "organization_example",
access_token: "access_token_example",
chunk_size: 1500,
chunk_overlap: 20,
skip_embedding_generation: false,
embedding_model: "OPENAI",
generate_sparse_vectors: false,
prepend_filename_to_chunks: false,
sync_files_on_connection: true,
sync_source_items: true,
});
⚙️ Parameters
organization: string
access_token: string
tags: object
chunk_size: number
chunk_overlap: number
skip_embedding_generation: boolean
embedding_model: EmbeddingGenerators
generate_sparse_vectors: boolean
prepend_filename_to_chunks: boolean
sync_files_on_connection: boolean
request_id: string
sync_source_items: boolean
Enabling this flag will fetch all available content from the source to be listed via the list items endpoint
🔄 Return
🌐 Endpoint
/integrations/gitbook
POST
carbon.integrations.createAwsIamUser
Create a new IAM user with permissions to:
🛠️ Usage
const createAwsIamUserResponse = await carbon.integrations.createAwsIamUser({
access_key: "access_key_example",
access_key_secret: "access_key_secret_example",
sync_source_items: true,
});
⚙️ Parameters
access_key: string
access_key_secret: string
sync_source_items: boolean
Enabling this flag will fetch all available content from the source to be listed via the list items endpoint
🔄 Return
🌐 Endpoint
/integrations/s3
POST
carbon.integrations.getOauthUrl
This endpoint can be used to generate the following URLs:
- An OAuth URL for OAuth based connectors
- A file syncing URL which skips the OAuth flow if the user already has a valid access token and takes them to the success state.
🛠️ Usage
const getOauthUrlResponse = await carbon.integrations.getOauthUrl({
service: "GOOGLE_DRIVE",
chunk_size: 1500,
chunk_overlap: 20,
skip_embedding_generation: false,
embedding_model: "OPENAI",
generate_sparse_vectors: false,
prepend_filename_to_chunks: false,
sync_files_on_connection: true,
set_page_as_boundary: false,
connecting_new_account: false,
request_id: "26453c8f-69ab-4eb3-bc25-0ca995b118a0",
use_ocr: false,
parse_pdf_tables_with_ocr: false,
enable_file_picker: true,
sync_source_items: true,
incremental_sync: false,
});
⚙️ Parameters
service: DataSourceType
tags: any
scope: string
chunk_size: number
chunk_overlap: number
skip_embedding_generation: boolean
embedding_model: EmbeddingGeneratorsNullable
zendesk_subdomain: string
microsoft_tenant: string
sharepoint_site_name: string
confluence_subdomain: string
generate_sparse_vectors: boolean
prepend_filename_to_chunks: boolean
max_items_per_chunk: number
Number of objects per chunk. For csv, tsv, xlsx, and json files only.
salesforce_domain: string
sync_files_on_connection: boolean
Used to specify whether Carbon should attempt to sync all your files automatically when authorization is complete. This is only supported for a subset of connectors and will be ignored for the rest. Supported connectors: Intercom, Zendesk, Gitbook, Confluence, Salesforce, Freshdesk
set_page_as_boundary: boolean
data_source_id: number
Used to specify a data source to sync from if you have multiple connected. It can be skipped if you only have one data source of that type connected or are connecting a new account.
connecting_new_account: boolean
Used to connect a new data source. If not specified, we will attempt to create a sync URL for an existing data source based on type and ID.
request_id: string
This request id will be added to all files that get synced using the generated OAuth URL
use_ocr: boolean
Enable OCR for files that support it. Supported formats: pdf
parse_pdf_tables_with_ocr: boolean
enable_file_picker: boolean
Enable the integration's file picker for sources that support it. Supported sources: SHAREPOINT, DROPBOX, BOX, ONEDRIVE, GOOGLE_DRIVE
sync_source_items: boolean
Enabling this flag will fetch all available content from the source to be listed via the list items endpoint
incremental_sync: boolean
Only sync files if they have not already been synced or if the embedding properties have changed. This flag is currently supported by ONEDRIVE, GOOGLE_DRIVE, BOX, DROPBOX. It will be ignored for other data sources.
file_sync_config: HelpdeskFileSyncConfigNullable
🔄 Return
🌐 Endpoint
/integrations/oauth_url
POST
carbon.integrations.listConfluencePages
To begin listing a user's Confluence pages, at least a `data_source_id` of a connected Confluence account must be specified. This base request returns a list of root pages for every space the user has access to in a Confluence instance. To traverse further down the user's page directory, additional requests to this endpoint can be made with the same `data_source_id` and with `parent_id` set to the ID of a page from a previous request. For convenience, the `has_children` property in each directory item in the response list will flag which pages will return non-empty lists of pages when set as the `parent_id`.
🛠️ Usage
const listConfluencePagesResponse =
await carbon.integrations.listConfluencePages({
data_source_id: 1,
});
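Traversing beyond the root pages, as described above, can be sketched as a walk that follows `has_children`. The `listPages` callback and the `id`/`has_children` item shape below are assumptions for illustration; in practice the callback would wrap `carbon.integrations.listConfluencePages` and extract the directory items from its response:

```typescript
// Hypothetical item shape for a directory entry in the listing response.
type ConfluencePage = { id: string; has_children: boolean };

// Collect every page ID reachable from the root listing by repeatedly
// re-querying with parent_id, as the route description explains.
async function collectPageIds(
  listPages: (parentId?: string) => Promise<ConfluencePage[]>
): Promise<string[]> {
  const ids: string[] = [];
  const queue: (string | undefined)[] = [undefined]; // undefined = root listing
  while (queue.length > 0) {
    const parentId = queue.shift();
    for (const page of await listPages(parentId)) {
      ids.push(page.id);
      if (page.has_children) queue.push(page.id); // only pages flagged as parents
    }
  }
  return ids;
}
```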
⚙️ Parameters
data_source_id: number
parent_id: string
🔄 Return
🌐 Endpoint
/integrations/confluence/list
POST
carbon.integrations.listDataSourceItems
List Data Source Items
🛠️ Usage
const listDataSourceItemsResponse =
await carbon.integrations.listDataSourceItems({
data_source_id: 1,
order_by: "name",
order_dir: "asc",
});
⚙️ Parameters
data_source_id: number
parent_id: string
filters: ListItemsFiltersNullable
pagination: Pagination
order_by: ExternalSourceItemsOrderBy
order_dir: OrderDirV2
🔄 Return
🌐 Endpoint
/integrations/items/list
POST
carbon.integrations.listFolders
After connecting your Outlook account, you can use this endpoint to list all of your folders on Outlook. This includes both system folders like "inbox" and user-created folders.
🛠️ Usage
const listFoldersResponse = await carbon.integrations.listFolders({});
⚙️ Parameters
dataSourceId: number
🌐 Endpoint
/integrations/outlook/user_folders
GET
carbon.integrations.listGitbookSpaces
After connecting your Gitbook account, you can use this endpoint to list all of your spaces under the current organization.
🛠️ Usage
const listGitbookSpacesResponse = await carbon.integrations.listGitbookSpaces({
dataSourceId: 1,
});
⚙️ Parameters
dataSourceId: number
🌐 Endpoint
/integrations/gitbook/spaces
GET
carbon.integrations.listLabels
After connecting your Gmail account, you can use this endpoint to list all of your labels. User-created labels will have the type "user" and Gmail's default labels will have the type "system".
🛠️ Usage
const listLabelsResponse = await carbon.integrations.listLabels({});
⚙️ Parameters
dataSourceId: number
🌐 Endpoint
/integrations/gmail/user_labels
GET
carbon.integrations.listOutlookCategories
After connecting your Outlook account, you can use this endpoint to list all of your categories on Outlook. We currently support listing up to 250 categories.
🛠️ Usage
const listOutlookCategoriesResponse =
  await carbon.integrations.listOutlookCategories({});
⚙️ Parameters
dataSourceId: number
🌐 Endpoint
/integrations/outlook/user_categories
GET
carbon.integrations.listRepos
Once you have connected your GitHub account, you can use this endpoint to list the repositories your account has access to. You can use a data source ID or username to fetch from a specific account.
🛠️ Usage
const listReposResponse = await carbon.integrations.listRepos({
  perPage: 30,
  page: 1,
});
⚙️ Parameters
perPage: number
page: number
dataSourceId: number
🌐 Endpoint
/integrations/github/repos
GET
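Because this endpoint is paginated via page and perPage, fetching every repository means looping until a short page comes back. A sketch of that loop, with `fetchPage` standing in for a call to carbon.integrations.listRepos (the item type is left generic since the exact response shape isn't shown here):

```typescript
// Page through a list endpoint until a page shorter than perPage
// signals that there is nothing left to fetch.
async function collectAllRepos<T>(
  fetchPage: (page: number, perPage: number) => Promise<T[]>,
  perPage = 30,
): Promise<T[]> {
  const all: T[] = [];
  for (let page = 1; ; page++) {
    const batch = await fetchPage(page, perPage);
    all.push(...batch);
    if (batch.length < perPage) break; // short (or empty) page: done
  }
  return all;
}
```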
carbon.integrations.syncConfluence
After listing pages in a user's Confluence account, the set of selected page ids and the connected account's data_source_id can be passed into this endpoint to sync them into Carbon. The additional parameters listed below can be used to associate data with the selected pages or alter the behavior of the sync.
🛠️ Usage
const syncConfluenceResponse = await carbon.integrations.syncConfluence({
  data_source_id: 1,
  ids: ["string_example"],
  chunk_size: 1500,
  chunk_overlap: 20,
  skip_embedding_generation: false,
  embedding_model: "OPENAI",
  generate_sparse_vectors: false,
  prepend_filename_to_chunks: false,
  set_page_as_boundary: false,
  request_id: "3d0330f2-f2e4-482b-9ca7-91d3a1bbbd18",
  use_ocr: false,
  parse_pdf_tables_with_ocr: false,
  incremental_sync: false,
});
⚙️ Parameters
data_source_id: number
ids: IdsProperty
tags: object
chunk_size: number
chunk_overlap: number
skip_embedding_generation: boolean
embedding_model: EmbeddingGeneratorsNullable
generate_sparse_vectors: boolean
prepend_filename_to_chunks: boolean
max_items_per_chunk: number
Number of objects per chunk. For csv, tsv, xlsx, and json files only.
set_page_as_boundary: boolean
request_id: string
use_ocr: boolean
parse_pdf_tables_with_ocr: boolean
incremental_sync: boolean
Only sync files if they have not already been synced or if the embedding properties have changed. This flag is currently supported by ONEDRIVE, GOOGLE_DRIVE, BOX, DROPBOX. It will be ignored for other data sources.
file_sync_config: HelpdeskGlobalFileSyncConfigNullable
🔄 Return
🌐 Endpoint
/integrations/confluence/sync
POST
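The chunk_size and chunk_overlap parameters control how synced content is split before embedding: consecutive chunks share chunk_overlap units of content so context isn't lost at chunk boundaries. A character-level sketch of the idea (Carbon's actual chunking operates on tokens, so this is only an illustration of the sliding-window mechanics):

```typescript
// Split text into windows of chunkSize characters, advancing by
// (chunkSize - chunkOverlap) each step so neighbors share an overlap.
function chunkText(
  text: string,
  chunkSize: number,
  chunkOverlap: number,
): string[] {
  const chunks: string[] = [];
  const step = chunkSize - chunkOverlap;
  for (let start = 0; start < text.length; start += step) {
    chunks.push(text.slice(start, start + chunkSize));
  }
  return chunks;
}
```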
carbon.integrations.syncDataSourceItems
Sync Data Source Items
🛠️ Usage
const syncDataSourceItemsResponse =
  await carbon.integrations.syncDataSourceItems({
    data_source_id: 1,
  });
⚙️ Parameters
data_source_id: number
🔄 Return
🌐 Endpoint
/integrations/items/sync
POST
carbon.integrations.syncFiles
After listing files and folders via /integrations/items/sync and /integrations/items/list, use the selected items' external ids as the ids in this endpoint to sync them into Carbon. SharePoint items take an additional parameter, root_id, which identifies the drive the file or folder is in and is stored in root_external_id. That additional parameter is optional; omitting it tells the sync to assume the item is stored in the default Documents drive.
🛠️ Usage
const syncFilesResponse = await carbon.integrations.syncFiles({
  data_source_id: 1,
  ids: ["string_example"],
  chunk_size: 1500,
  chunk_overlap: 20,
  skip_embedding_generation: false,
  embedding_model: "OPENAI",
  generate_sparse_vectors: false,
  prepend_filename_to_chunks: false,
  set_page_as_boundary: false,
  request_id: "3d0330f2-f2e4-482b-9ca7-91d3a1bbbd18",
  use_ocr: false,
  parse_pdf_tables_with_ocr: false,
  incremental_sync: false,
});
⚙️ Parameters
data_source_id: number
ids: IdsProperty
tags: object
chunk_size: number
chunk_overlap: number
skip_embedding_generation: boolean
embedding_model: EmbeddingGeneratorsNullable
generate_sparse_vectors: boolean
prepend_filename_to_chunks: boolean
max_items_per_chunk: number
Number of objects per chunk. For csv, tsv, xlsx, and json files only.
set_page_as_boundary: boolean
request_id: string
use_ocr: boolean
parse_pdf_tables_with_ocr: boolean
incremental_sync: boolean
Only sync files if they have not already been synced or if the embedding properties have changed. This flag is currently supported by ONEDRIVE, GOOGLE_DRIVE, BOX, DROPBOX. It will be ignored for other data sources.
file_sync_config: HelpdeskGlobalFileSyncConfigNullable
🔄 Return
🌐 Endpoint
/integrations/files/sync
POST
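A common pattern is to chain the items-list call with this endpoint: list the connected source's items, pick the ones you want by external id, and pass those ids to the sync. The sketch below injects a simplified client interface so the flow is clear; the real SDK's responses are wrapped objects, so the flat `SourceItem[]` shape here is an assumption for illustration:

```typescript
interface SourceItem {
  external_id: string;
  name: string;
}

// Simplified stand-in for the relevant slice of the Carbon client.
interface ItemsClient {
  listDataSourceItems(req: { data_source_id: number }): Promise<SourceItem[]>;
  syncFiles(req: { data_source_id: number; ids: string[] }): Promise<string[]>;
}

// List items, select by predicate, then sync the chosen external ids.
async function syncSelectedItems(
  client: ItemsClient,
  dataSourceId: number,
  shouldSync: (item: SourceItem) => boolean,
): Promise<string[]> {
  const items = await client.listDataSourceItems({
    data_source_id: dataSourceId,
  });
  const ids = items.filter(shouldSync).map((item) => item.external_id);
  if (ids.length === 0) return [];
  return client.syncFiles({ data_source_id: dataSourceId, ids });
}
```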
carbon.integrations.syncGitHub
Refer to this article to obtain an access token: https://docs.github.com/en/authentication/keeping-your-account-and-data-secure/managing-your-personal-access-tokens. Make sure that your access token has permission to read content from your desired repos. Note that if your access token expires, you will need to manually update it through this endpoint.
🛠️ Usage
const syncGitHubResponse = await carbon.integrations.syncGitHub({
  username: "username_example",
  access_token: "access_token_example",
  sync_source_items: false,
});