ENH: merge_asof threshold minimum #61164
Comments
Thanks for the request!
Can you provide a full example here, namely the DataFrames for
Thanks for the quick response! That phrasing was just taken from the current reference for this function in the examples section. I don't think it's actually important for my specific request. Is there any other info I can provide?
As is, I do not understand the issue with the current features
Ah I see, here's an example with the current usage:
Which effectively acts like a group by on "ticker" then finds the nearest other row with the same ticker that has a "time" within 10ms, but not the same time. I am proposing that by specifying a min tolerance as well, we'd have
Let me know if that makes sense as to why this feature is desired -- if not I can craft more examples or would be happy to have more discussion about this. Thanks!
@Lituchy I believe you can achieve the same behavior with an auxiliary column
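The snippet that accompanied this comment did not survive on this page. As a minimal sketch of one way an auxiliary column might emulate a lower bound (not necessarily what the commenter posted): shift the left key back by the minimum gap, then run an ordinary backward `merge_asof` with the tolerance reduced by the same amount. The frames, the 2ms/10ms bounds, and the `time_aux` / `quote_time` column names are illustrative assumptions.

```python
import pandas as pd

# Illustrative data (assumed, not from the thread).
trades = pd.DataFrame(
    {
        "time": pd.to_datetime(
            ["2016-05-25 13:30:00.023", "2016-05-25 13:30:00.038"]
        ),
        "ticker": ["MSFT", "MSFT"],
    }
)
quotes = pd.DataFrame(
    {
        "time": pd.to_datetime(
            [
                "2016-05-25 13:30:00.015",
                "2016-05-25 13:30:00.022",
                "2016-05-25 13:30:00.030",
                "2016-05-25 13:30:00.037",
            ]
        ),
        "ticker": ["MSFT"] * 4,
        "bid": [1.0, 2.0, 3.0, 4.0],
    }
)

min_tol = pd.Timedelta("2ms")   # assumed lower bound
max_tol = pd.Timedelta("10ms")  # assumed upper bound

# Auxiliary key: each trade time moved back by the minimum gap, so a
# backward search can only land on quotes at least min_tol older.
left = trades.assign(time_aux=trades["time"] - min_tol)
right = quotes.rename(columns={"time": "quote_time"})

out = pd.merge_asof(
    left,
    right,
    left_on="time_aux",
    right_on="quote_time",
    by="ticker",
    tolerance=max_tol - min_tol,  # window measured from the shifted key
    direction="backward",
).drop(columns="time_aux")

print(out)
```

With the default `allow_exact_matches=True`, a quote exactly `min_tol` older than the trade is still matched; passing `allow_exact_matches=False` would make the lower bound strict. The shift only works as written for `direction="backward"`.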
I think that could work in some cases -- like my posted example above -- however there are a few key features which would make this not necessarily work:
If you have any other ideas on how we may be able to deal with the above I'd be very curious to hear; I haven't been able to think of a way to get all this behavior without a min_threshold.
Feature Type
Adding new functionality to pandas
Changing existing functionality in pandas
Removing existing functionality in pandas
Problem Description
I often use the merge_asof function on time series data. The tolerance and allow_exact_matches parameters are useful for filtering matches, but more granular control over the tolerance would help. Being able to supply a minimum tolerance in addition to the existing maximum tolerance would give the user that control.
Feature Description
A current example for this function is the following:
We only asof within 10ms between the quote time and the trade time and we exclude exact matches on time. However, prior data will propagate forward.
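For reference, a small self-contained sketch of that current behavior. The frames below are made-up illustrative data, not the ones from the pandas documentation; only the `merge_asof` arguments follow the description above.

```python
import pandas as pd

# Illustrative data (assumed): three trades and three quotes for one ticker.
trades = pd.DataFrame(
    {
        "time": pd.to_datetime(
            [
                "2016-05-25 13:30:00.023",
                "2016-05-25 13:30:00.038",
                "2016-05-25 13:30:00.060",
            ]
        ),
        "ticker": ["MSFT", "MSFT", "MSFT"],
        "price": [51.95, 51.95, 51.96],
    }
)
quotes = pd.DataFrame(
    {
        "time": pd.to_datetime(
            [
                "2016-05-25 13:30:00.015",
                "2016-05-25 13:30:00.022",
                "2016-05-25 13:30:00.030",
            ]
        ),
        "ticker": ["MSFT", "MSFT", "MSFT"],
        "bid": [51.90, 51.95, 51.97],
    }
)

# Match each trade to the latest strictly earlier quote for the same
# ticker, provided that quote is at most 10ms old.
merged = pd.merge_asof(
    trades,
    quotes,
    on="time",
    by="ticker",
    tolerance=pd.Timedelta("10ms"),
    allow_exact_matches=False,
)

print(merged)
```

Here the first trade picks up the quote that is only 1ms older, which is the kind of match a minimum tolerance would skip. The last trade has no quote within 10ms, so its `bid` is missing.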
I am envisioning a version where we could have
We only asof within 10ms between the quote time and the trade time, but more than 2ms between the quote time and the trade time, and we exclude exact matches on time. However, prior data will propagate forward.
Alternative Solutions
Another solution to this problem would be to augment the existing tolerance argument to accept either a single datetimelike object or a tuple of datetimelike objects, which would act as a lower and upper bound, respectively.
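As a rough sketch of how such an argument might be interpreted, the hypothetical helper below turns either form into a `(lower, upper)` pair. `split_tolerance` is not part of pandas, and a lower bound of zero for the single-value case is an assumption chosen to match today's behavior.

```python
import pandas as pd


def split_tolerance(tolerance):
    """Return (lower, upper) Timedeltas from one value or a 2-tuple.

    Hypothetical helper for illustration only; not a pandas API.
    """
    if isinstance(tolerance, tuple):
        if len(tolerance) != 2:
            raise ValueError("tolerance tuple must be (lower, upper)")
        lower, upper = (pd.Timedelta(t) for t in tolerance)
    else:
        # A single value keeps the current meaning: upper bound only.
        lower, upper = pd.Timedelta(0), pd.Timedelta(tolerance)
    if lower < pd.Timedelta(0) or lower > upper:
        raise ValueError("expected 0 <= lower <= upper")
    return lower, upper


print(split_tolerance(pd.Timedelta("10ms")))
print(split_tolerance((pd.Timedelta("2ms"), pd.Timedelta("10ms"))))
```

Keeping the single-value form unchanged would leave existing calls working, while the tuple form opts in to the lower bound.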
Additional Context
No response