@kithinji/ldf
v1.0.32
Create LLM datasets in a simple intuitive format
Transform LDF (lugha dataset format) to JSONL
Instructions
npm i -g @kithinji/ldf
Write the config file (ldf.json)
{
"src": "src",
"dist": "dist",
"shards": [
"lugha_dataset",
"text-to-sql.ldf"
],
"config": {
"tool": "to_assistant",
"reasoning": "hide"
}
}
How your files can be structured
|---- dist
|---- src
| |----lugha_dataset
| | |----arrays.ldf
| | |----functions.ldf
| |----text-to-sql.ldf
|---- ldf.json
The configuration file helps ldf parse your dataset.
- src: The home directory
- dist: Where to write the data.jsonl file
- shards: Where your data files are located
  - You can import folders and the tool will read all files ending with the .ldf extension
  - You can also import individual files
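The shard rules above can be sketched in a few lines of Python. This is only an illustration of the described behaviour, not the tool's actual implementation: the function name `resolve_shards` is hypothetical, and it assumes that a folder shard contributes only the `.ldf` files directly inside it (whether the real tool recurses into subfolders is not stated here).

```python
from pathlib import Path


def resolve_shards(src, shards):
    """Expand shard entries into a flat list of .ldf file paths.

    Assumption: a shard naming a folder contributes the .ldf files
    directly inside it (no recursion); a shard naming a file is used as-is.
    """
    root = Path(src)
    files = []
    for shard in shards:
        path = root / shard
        if path.is_dir():
            # Folder shard: keep only files ending in .ldf, in a stable order.
            files.extend(sorted(path.glob("*.ldf")))
        elif path.is_file() and path.suffix == ".ldf":
            # Individual file shard.
            files.append(path)
        else:
            raise FileNotFoundError(f"shard not found or not an .ldf file: {path}")
    return files
```

With the layout shown earlier, `resolve_shards("src", ["lugha_dataset", "text-to-sql.ldf"])` would list `arrays.ldf` and `functions.ldf` from the folder, followed by `text-to-sql.ldf`.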
Example of a conversation
conversation {
user {
content {
p { "What can you do for me?" }
}
}
assistant {
content {
reason {
p {
"Let me think. The user is asking what I can do for them."
"I have various tools in my arsenal that can help the user automate some tasks."
}
}
answer {
p { "I can read and reply your emails." }
}
}
}
}
To compile the dataset run
ldf ldf.json
LDF will then convert that to JSONL format
{"messages":[{ "role": "user", "content": "p { \"What can you do for me\" }"}, { "role": "assistant", "content": "reason { p { \"Let me think...\" } } answer { p { \"I can read and reply your emails\" } }"}]}
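After compiling, it can be useful to sanity-check the generated file before training on it. The sketch below is not part of ldf. It assumes, based only on the sample line above, that every record is a JSON object with a non-empty `messages` list whose items carry string `role` and `content` fields. The function name `check_jsonl_lines` is hypothetical.

```python
import json


def check_jsonl_lines(lines):
    """Return the number of valid records; raise ValueError on the first bad one.

    Assumed record shape (taken from the sample output, not from a spec):
    {"messages": [{"role": "...", "content": "..."}, ...]}
    """
    count = 0
    for number, line in enumerate(lines, start=1):
        line = line.strip()
        if not line:
            continue  # tolerate blank lines
        record = json.loads(line)  # JSONDecodeError is a subclass of ValueError
        if not isinstance(record, dict):
            raise ValueError(f"line {number}: record must be a JSON object")
        messages = record.get("messages")
        if not isinstance(messages, list) or not messages:
            raise ValueError(f"line {number}: 'messages' must be a non-empty list")
        for message in messages:
            if not isinstance(message, dict):
                raise ValueError(f"line {number}: each message must be an object")
            if not isinstance(message.get("role"), str):
                raise ValueError(f"line {number}: message is missing a string 'role'")
            if not isinstance(message.get("content"), str):
                raise ValueError(f"line {number}: message is missing a string 'content'")
        count += 1
    return count
```

The function accepts any iterable of lines, so an open file handle for `dist/data.jsonl` can be passed in directly.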