PERF: Frequency determination when concatting datetimes #65920

Open
rhshadrach wants to merge 1 commit into pandas-dev:main from rhshadrach:perf_datetime_concat_freq

Conversation

@rhshadrach

Member

Ref: pandas-dev/asv-runner#125

Addresses the bottom 4 benchmarks there.

import pandas as pd

idx = pd.date_range("2024-01-01", periods=30 * 24 * 3600, freq="s")
chunks = [
    pd.Series(range(len(c)), index=c) 
    for c in (idx[i:i + 2592] for i in range(0, len(idx), 2592))
]
%timeit pd.concat(chunks)
# 23.4 ms ± 186 μs per loop (mean ± std. dev. of 7 runs, 10 loops each)   <--- main
# 16.8 ms ± 119 μs per loop (mean ± std. dev. of 7 runs, 100 loops each)  <--- PR

# can check boundary continuity without boxing endpoints to
# Timestamps and doing per-pair offset arithmetic.
step = Timedelta(freq).as_unit(self.unit)._value
i8s = [idx._data._ndarray.view("i8") for idx in to_concat_nonempty]

Member

do we know these all have the same unit?
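
To make the concern concrete, here is a minimal sketch of why the unit matters once the comparison is done on raw int64 values. It uses plain NumPy `datetime64` arrays rather than pandas internals, and `chunks_are_contiguous` is a hypothetical helper written for illustration, not the code in this PR. The assumption is that the fast path compares the last value of one chunk with the first value of the next, using a step expressed in a single unit.

```python
import numpy as np


def chunks_are_contiguous(chunks, step):
    """Hypothetical helper: True if each chunk starts `step` after the previous one ends.

    `chunks` are int64 epoch arrays and `step` is an int; the check is only
    meaningful when all of them are expressed in the SAME unit.
    """
    nonempty = [c for c in chunks if len(c)]
    return all(
        nxt[0] - prev[-1] == step
        for prev, nxt in zip(nonempty, nonempty[1:])
    )


start = np.datetime64("2024-01-01", "s")
a = start + np.arange(0, 3)  # 00:00:00 .. 00:00:02, second resolution
b = start + np.arange(3, 6)  # 00:00:03 .. 00:00:05, second resolution
step_s = 1                   # one second, expressed in seconds

# Same unit everywhere: the integer comparison matches the datetime semantics.
same_unit = chunks_are_contiguous([a.view("i8"), b.view("i8")], step_s)

# Same instants, but the second chunk is stored in nanoseconds: the raw
# integers are no longer comparable and the check silently fails.
b_ns = b.astype("M8[ns]")
mixed_unit = chunks_are_contiguous([a.view("i8"), b_ns.view("i8")], step_s)

# Converting back to a common unit before viewing as int64 restores the result.
normalized = chunks_are_contiguous(
    [a.view("i8"), b_ns.astype("M8[s]").view("i8")], step_s
)
```

If the inputs are not guaranteed to share a unit at this point, either converting them to a common unit first or falling back to the existing path would avoid the mixed-unit case shown above.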
