dataset: Easy-to-Use Data Handling for SQL in Python

Introduction

dataset is a Python library that simplifies working with SQL databases. Its high-level API makes reading and writing rows about as straightforward as working with JSON files. Key features include implicit table creation, bulk loading, and transaction support.

As of version 1.0, dataset's data export features have moved into a separate, standalone package called datafreeze.

Installation

Install dataset with pip:

$ pip install dataset

Examples

Here's a quick example demonstrating how to connect to a database, insert data, and query it using dataset:

import dataset

# Connect to an SQLite database (or any other SQL DB)
db = dataset.connect('sqlite:///mydatabase.db')

# Get a table, implicitly created if it doesn't exist
table = db['mytable']

# Insert data
table.insert(dict(name='John Doe', age=30))
table.insert(dict(name='Jane Smith', age=25))

# Find data based on conditions
print("People younger than 30:")
for row in table.find(age={'<': 30}):
    print(f"- {row['name']}")

# Update data
table.update(dict(name='John Doe', age=31), ['name'])
print("\nUpdated John Doe's age:")
print(table.find_one(name='John Doe'))

Why Use It

dataset simplifies common database tasks, which makes it a good fit for developers who need to work with SQL data stores without the complexity of a full-fledged ORM. Implicit table creation, bulk loading, and transaction management reduce boilerplate and allow rapid data manipulation and exploration. This is particularly useful for scripting, data analysis, and small to medium-sized applications where development speed and ease of use matter most.