dataset: Easy-to-Use Data Handling for SQL in Python

This repository profile is provided by osrepos.com, an open source repository discovery platform.

Summary

dataset is a Python library that simplifies data handling for SQL data stores. It offers implicit table creation, bulk loading, and transaction support, with the aim of making database access as straightforward as working with JSON files.

Repository Information

Analyzed by OSRepos on March 13, 2026

Use at your own risk

OSRepos shares public repositories for knowledge and discovery only. Any installation, execution, configuration, or use of code from these repositories is the user's own responsibility. Always review the repository, source code, dependencies, licenses, and security implications before running or installing anything. OSRepos is not responsible for issues, damages, or losses resulting from third-party repositories.

Introduction

dataset provides a high-level API over SQL databases, so that reading and writing rows feels much like working with JSON files. Tables are created implicitly on first use, many rows can be loaded in bulk, and related changes can be grouped into transactions.

Note that as of version 1.0, dataset's data export features live in a separate, standalone package called datafreeze.

Installation

Installing dataset is simple using pip:

$ pip install dataset

Examples

Here's a quick example demonstrating how to connect to a database, insert data, and query it using dataset:

import dataset

# Connect to an SQLite database (or any other SQL DB)
db = dataset.connect('sqlite:///mydatabase.db')

# Get a table, implicitly created if it doesn't exist
table = db['mytable']

# Insert data
table.insert(dict(name='John Doe', age=30))
table.insert(dict(name='Jane Smith', age=25))

# Find data based on conditions
print("People younger than 30:")
for row in table.find(age={'<': 30}):
    print(f"- {row['name']}")

# Update data
table.update(dict(name='John Doe', age=31), ['name'])
print("\nUpdated John Doe's age:")
print(table.find_one(name='John Doe'))

Why Use It

dataset suits developers who need to work with SQL data stores without the overhead of a full-fledged ORM. Implicit table creation, bulk loading, and transaction management cut down on boilerplate code. That makes it particularly useful for scripting, data analysis, and small to medium-sized applications where development speed and ease of use matter most.
