feat: add polars file processor #472

Merged
kitagry merged 6 commits into master from add-polars-file-processor
Dec 22, 2025

@kitagry kitagry commented Dec 15, 2025

Member

Summary

This PR adds support for polars DataFrames alongside pandas DataFrames in gokart's file processors. It provides only minimal Polars support compared to #457 and #471.

Features

  • Multi-backend support: File processors now support both pandas and polars DataFrames
    • CsvFileProcessor
    • JsonFileProcessor
    • ParquetFileProcessor
    • FeatherFileProcessor
  • Backward compatible: Defaults to pandas when dataframe_type is not specified
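
As a rough illustration of the two points above, here is a minimal sketch of a processor that takes a per-instance backend choice and falls back to pandas. SketchCsvProcessor and DataFrameType are hypothetical names made up for this example; this is not the code added in this PR, and the real processors may be structured differently. The only library calls used are pandas.read_csv / DataFrame.to_csv and polars.read_csv / DataFrame.write_csv.

```python
from typing import Any, Literal, get_args

# Hypothetical type alias for this sketch; not taken from gokart.
DataFrameType = Literal['pandas', 'polars']


class SketchCsvProcessor:
    """Hypothetical stand-in for a CSV processor that supports two backends."""

    def __init__(self, dataframe_type: DataFrameType = 'pandas') -> None:
        # Defaulting to pandas keeps existing callers working unchanged.
        if dataframe_type not in get_args(DataFrameType):
            raise ValueError(f'unsupported dataframe_type: {dataframe_type!r}')
        self.dataframe_type = dataframe_type

    def load(self, path: str) -> Any:
        if self.dataframe_type == 'polars':
            # Imported lazily so that pandas-only installs never need polars.
            import polars as pl

            return pl.read_csv(path)
        import pandas as pd

        return pd.read_csv(path)

    def dump(self, df: Any, path: str) -> None:
        if self.dataframe_type == 'polars':
            df.write_csv(path)  # polars.DataFrame.write_csv
        else:
            df.to_csv(path, index=False)  # pandas.DataFrame.to_csv
```

Because the backend is chosen per instance, two processors with different backends can coexist in the same pipeline.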

Usage Example

import pandas as pd
import polars as pl
from gokart import TaskOnKart
from gokart.file_processor import FeatherFileProcessor  # import path assumed

class MyPolarsTask(TaskOnKart):
    def output(self):
        # Pass a processor explicitly to opt in to polars for this target
        return self.make_target('path/to/target.feather', processor=FeatherFileProcessor(dataframe_type='polars'))

    def run(self):
        df = pl.DataFrame({'a': [1, 2, 3]})
        self.dump(df)  # Uses polars-compatible processor

class MyPandasTask(TaskOnKart):
    def output(self):
        # No processor given, so the default pandas processor is used
        return self.make_target('path/to/target.feather')

    def run(self):
        df = pd.DataFrame({'a': [1, 2, 3]})
        self.dump(df)  # Uses pandas processor (default behavior)

Why not #457?

In #457, we switch between Polars and Pandas using GOKART_DATAFRAME_FRAMEWORK. However, this implies that we cannot use both at the same time. Since projects often migrate from Pandas gradually, we should allow users to use both Polars and Pandas simultaneously.
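
To make the difference concrete, here is a small sketch comparing a process-wide switch with a per-target parameter. It assumes GOKART_DATAFRAME_FRAMEWORK is read as an environment variable, which the description above does not state explicitly; backend_from_env, SketchTarget, and compare_switches are hypothetical names for illustration and are not part of gokart or of #457.

```python
import os
from dataclasses import dataclass

# Name taken from the PR text; treating it as an environment variable is an assumption.
ENV_VAR = 'GOKART_DATAFRAME_FRAMEWORK'


def backend_from_env(default: str = 'pandas') -> str:
    """Global switch: every target in the process sees the same value."""
    return os.environ.get(ENV_VAR, default)


@dataclass
class SketchTarget:
    """Hypothetical target that carries its own backend choice."""

    path: str
    dataframe_type: str = 'pandas'


def compare_switches() -> dict:
    previous = os.environ.get(ENV_VAR)
    os.environ[ENV_VAR] = 'polars'
    try:
        # Both lookups return the same backend; mixing is not possible.
        global_choice = [backend_from_env(), backend_from_env()]
    finally:
        # Restore the environment so the sketch has no side effects.
        if previous is None:
            os.environ.pop(ENV_VAR, None)
        else:
            os.environ[ENV_VAR] = previous

    # Per-target parameters let a legacy pandas target and a migrated
    # polars target live side by side.
    per_target = [
        SketchTarget('legacy.csv').dataframe_type,
        SketchTarget('migrated.csv', dataframe_type='polars').dataframe_type,
    ]
    return {'global': global_choice, 'per_target': per_target}
```

With the global switch both targets resolve to the same backend, while the per-target parameter lets each task choose independently during a gradual migration.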

This PR supports a minimal set of polars features; other useful features will be implemented afterwards!

@kitagry kitagry requested a review from hiro-o918 as a code owner December 15, 2025 14:49
@kitagry kitagry force-pushed the add-polars-file-processor branch 3 times, most recently from 2c8bc26 to 8826a52 Compare December 15, 2025 15:01
@kitagry kitagry force-pushed the add-polars-file-processor branch from 8826a52 to c39e67a Compare December 15, 2025 15:24
Comment thread gokart/file_processor/polars.py Outdated
Comment thread test/file_processor/test_polars.py Outdated

@hirosassa hirosassa left a comment

Collaborator

Looks good! 👍

kitagry commented Dec 19, 2025

Member Author

@hiro-o918 I fixed it. Could you review again?

@hiro-o918 hiro-o918 left a comment

Contributor

LGTM!

kitagry commented Dec 22, 2025

Member Author

Thank you!

@kitagry kitagry merged commit f5cf886 into master Dec 22, 2025
7 checks passed
@kitagry kitagry deleted the add-polars-file-processor branch December 22, 2025 07:06

3 participants