Repository History
32 repositories tagged with machine-learning

PETSA: Parameter-Efficient Test-Time Adaptation for Time Series Forecasting
PETSA is a parameter-efficient method for Test-Time Adaptation (TTA) in time series forecasting, targeting the performance degradation caused by non-stationary data. It adapts pre-trained models during inference by updating only small calibration modules, which reduces memory and compute costs. The method combines low-rank adapters, dynamic gating, and a specialized loss, and is reported to improve forecasting accuracy across diverse backbones and datasets.
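
To make the parameter-efficiency point concrete, here is a minimal sketch of the arithmetic behind a generic low-rank adapter. It is not PETSA's actual module design; the layer size and rank are hypothetical values chosen only for illustration.

```python
def full_params(d_in: int, d_out: int) -> int:
    """Parameters in a dense d_in x d_out weight matrix."""
    return d_in * d_out


def low_rank_params(d_in: int, d_out: int, rank: int) -> int:
    """Parameters in a rank-`rank` factorization A (d_in x rank) @ B (rank x d_out)."""
    return rank * (d_in + d_out)


# Hypothetical sizes, not taken from the PETSA repository.
D_IN, D_OUT, RANK = 512, 512, 8

dense = full_params(D_IN, D_OUT)
adapter = low_rank_params(D_IN, D_OUT, RANK)
fraction = adapter / dense

print(f"dense: {dense}, adapter: {adapter}, fraction: {fraction:.4f}")
```

With these assumed sizes the adapter holds about 3% of the parameters of the dense layer it calibrates, which is why updating only such modules at test time is cheap.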

Argo Workflows: A Cloud-Native Workflow Engine for Kubernetes
Argo Workflows is an open-source, container-native workflow engine for orchestrating parallel jobs on Kubernetes. Users define multi-step workflows in which each step is a container, with dependencies between steps modeled as directed acyclic graphs (DAGs). This CNCF-graduated project is commonly used for machine learning pipelines, data processing, and CI/CD.
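
The DAG idea can be illustrated without Kubernetes. The sketch below uses Python's standard `graphlib` module to group steps into waves that could run in parallel; the step names are hypothetical, and this is not Argo's API or manifest format, only the scheduling concept.

```python
from graphlib import TopologicalSorter

# Hypothetical pipeline: each key is a step, each value is the set of steps it depends on.
dependencies = {
    "preprocess": set(),
    "train": {"preprocess"},
    "evaluate": {"train"},
    "report": {"train"},
}


def execution_waves(deps):
    """Group steps into waves; every step in a wave has all of its dependencies finished."""
    sorter = TopologicalSorter(deps)
    sorter.prepare()  # raises graphlib.CycleError if the graph is not acyclic
    waves = []
    while sorter.is_active():
        ready = sorted(sorter.get_ready())  # steps whose dependencies are all done
        waves.append(ready)
        sorter.done(*ready)
    return waves


waves = execution_waves(dependencies)
print(waves)
```

Here `evaluate` and `report` land in the same wave because both depend only on `train`, so a workflow engine would be free to run them concurrently.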

Unstructured: Open-Source Pre-Processing for Complex Document Data
The `unstructured` library is an open-source ETL solution designed to convert complex, unstructured documents into clean, structured data. It streamlines the data processing workflow for language models, offering tools for ingesting and pre-processing various document types like PDFs, HTML, and Word documents. This library simplifies the transformation of raw information into formats suitable for advanced AI applications.

EasyEdit: An Easy-to-Use Knowledge Editing Framework for LLMs
EasyEdit is an open-source framework for efficient knowledge editing in Large Language Models (LLMs). It provides a unified, easy-to-use platform to modify, insert, or erase specific knowledge within LLMs without negatively impacting overall performance. It is aimed at keeping LLMs aligned with evolving user needs and at correcting factual inaccuracies.

RecDebiasing: A Comprehensive Collection of Recommendation Debiasing Methods
RecDebiasing is a GitHub repository that curates debiasing methods for recommendation systems. It compiles recent research papers, relevant datasets, and associated codebases, offering a centralized resource for understanding and addressing various biases. The collection is aimed at researchers and practitioners building fairer and more accurate recommender systems.

awslabs/mcp: Enhance AI Assistants with AWS Model Context Protocol Servers
The awslabs/mcp repository offers a suite of specialized Model Context Protocol (MCP) servers for working with AWS. These servers connect Large Language Model (LLM) applications to various AWS services, giving AI assistants real-time access to documentation, contextual guidance, and best practices. The goal is to improve the quality and accuracy of AI-generated output for cloud development and operations.

EliteQuant: A Comprehensive List of Quantitative Finance Resources
EliteQuant is a curated GitHub repository of online resources for quantitative modeling, trading, and portfolio management. It serves as a reference for professionals and enthusiasts looking for tools, libraries, and knowledge in quantitative finance, aggregating platforms, systems, libraries, models, data sources, and more.

Modular Platform: A Unified AI Development and Deployment Solution
The Modular Platform is an open, fully integrated suite of AI libraries and tools, including MAX and Mojo, designed to accelerate model serving and scale GenAI deployments. It abstracts hardware complexity and aims to deliver high GPU and CPU performance for popular open models without code changes, simplifying AI development and deployment.

TabSTAR: A Tabular Foundation Model for Data with Text Fields
TabSTAR is a tabular foundation model designed to process tabular data that includes text fields. It offers a user-friendly package for applying pretrained models to your own datasets, alongside a research mode for advanced development and benchmarking. The aim is to simplify applying deep learning to tabular data with free-text columns.

NVIDIA Isaac GR00T: A Foundation Model for Generalist Robots
NVIDIA Isaac GR00T N1.6 is an open vision-language-action (VLA) foundation model designed for generalized humanoid robot skills. It enables robots to perform manipulation tasks in diverse environments by taking multimodal input, including language and images. Researchers and professionals can leverage this model for fine-tuning on custom datasets and deploying it for inference.

GraphRAG: A Modular Graph-Based RAG System for LLM Discovery
GraphRAG, developed by Microsoft, is a modular graph-based Retrieval-Augmented Generation (RAG) system. It is designed to extract meaningful, structured data from unstructured text using Large Language Models (LLMs). The system enhances an LLM's ability to reason about private and narrative data by leveraging knowledge graph memory structures.

scikit-learn: The Essential Python Library for Machine Learning
scikit-learn is a widely used open-source Python library for machine learning, built upon SciPy. It provides a comprehensive suite of tools for data mining and data analysis, covering a broad range of algorithms behind a consistent, user-friendly interface.