memoizes data available on the ListBucket request #610
|
I am generally very much in favor of this change, as this has been a long-standing issue. The main problem is that it is not backwards compatible. For better or for worse, the existing implementation always calls #head_object and returns fresh/current values. Memoizing these values generally makes sense, but it can introduce bugs in code that relies on the values changing. For example, I might write a script that polls for a change to last_modified, expecting some other process to change the data. If I updated SDKs and then my script broke, I would be very unhappy.

Given that the SDK follows semver, users locked on version 1.x.y should be able to update within 1.x without breaking changes. I would be in favor of this change if the default behavior could be maintained and memoized attributes were opt-in. For example:

    s3 = AWS::S3.new(s3_cache_object_attributes: true)
    s3.buckets['aws-sdk'].objects.each do |obj|
      # look ma, no head request!
      puts obj.key + ' => ' + obj.etag
    end

This could be accomplished by registering (in lib/aws/s3/config.rb) a new configuration option and then using

As a side note, the v2 SDK has a very early preview of the S3 object interface available. V2 of the SDK works as expected out of the box and memoizes all resource data by default.

    s3 = Aws::S3::Resource.new
    s3.bucket('aws-sdk').objects.each do |obj|
      # look ma, no head request!
      puts obj.key + ' => ' + obj.etag
    end

The v2 |
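To make the backwards-compatibility concern concrete, here is a minimal sketch of opt-in memoization. It does not use the SDK: `OptInObject`, its `cache_attributes` keyword, and the block that stands in for a HEAD request are all hypothetical names chosen for illustration.

```ruby
# Hypothetical stand-in for an S3 object; none of these names come from the SDK.
class OptInObject
  attr_reader :head_calls

  def initialize(key, cache_attributes: false, &head)
    @key = key
    @cache_attributes = cache_attributes # opt-in, off by default
    @head = head                         # simulates a HEAD request
    @head_calls = 0
    @cached = nil
  end

  def last_modified
    attributes[:last_modified]
  end

  private

  # Reuses the first response only when caching was requested;
  # otherwise every attribute read triggers a new simulated HEAD.
  def attributes
    return @cached if @cache_attributes && @cached
    @head_calls += 1
    @cached = @head.call
  end
end

# Simulated remote state that another process changes between polls.
remote = { last_modified: 1 }
fetch = -> { remote.dup }

default_obj = OptInObject.new('a.txt', &fetch)
cached_obj  = OptInObject.new('a.txt', cache_attributes: true, &fetch)

default_obj.last_modified
cached_obj.last_modified
remote[:last_modified] = 2 # the "other process" updates the object

default_obj.last_modified # fresh value, costs a second simulated HEAD
cached_obj.last_modified  # memoized value, no additional HEAD
```

With the flag off, a polling script keeps seeing changes exactly as before; only callers who pass the flag trade freshness for fewer requests.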
|
Hey Trevor, thanks so much for your quick feedback. I agree that this patch isn't backwards compatible and that's a problem - one simplistic solution would be to make "force_refresh" default to true - but it felt too hacky even for scratching my own itch :) Anyway, I like the idea of adding a new configuration - would you say it's worth doing? It shouldn't take more than a couple hours to make this patch rely on a new configuration and be 100% backwards compatible. Or should I and everybody else just start using the RC of the new version? Best! |
|
Version 2 is different enough from v1 that it is not a drop-in replacement in a number of places. That said, they use different namespaces, so you can use both gems in the same project. The latest v1 release, 1.52.0, is available as the

To answer your other question, I would be willing to merge this pull request provided we can ensure backwards compatibility and the configuration option is simple enough for users to opt in to this feature. I have stopped feature work on v1, so I likely won't tackle this myself, but we welcome community contributions, and this seems like a good fit. If you would find it useful enough to stop using your fork and have this merged into mainline, I'd be happy to see that happen. |
|
Hey Trevor, there you go - it took less time than I expected. I stuck with the configuration name you mentioned-- can you think of a better one? I also removed the "force_refresh" option for simplicity's sake. I guess one can just create a new instance of the object if they want to refresh its attributes. (Rails has a "reload" method on ActiveRecord instances-- I think if we're going down that path, a full "reload", which would just return a new, empty instance of the same S3Object, is probably simpler/better than having a "force_refresh" option on each one of the attributes.) Looks better? |
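The "reload" idea floated above could look roughly like the sketch below. This is not code from the patch: `ReloadableObject` and the `head` callable are hypothetical, and `reload` here returns a new instance with nothing memoized, as the comment proposes.

```ruby
# Hypothetical class illustrating "reload returns a new, empty instance".
class ReloadableObject
  attr_reader :key

  def initialize(key, head)
    @key = key
    @head = head      # callable simulating a HEAD request
    @attributes = nil # filled on first access, then memoized
  end

  def etag
    @attributes ||= @head.call(@key)
    @attributes[:etag]
  end

  # Returns a fresh instance with nothing memoized; the receiver is unchanged.
  def reload
    self.class.new(@key, @head)
  end
end

etags = { 'a.txt' => 'v1' }
head = ->(key) { { etag: etags[key] } }

obj = ReloadableObject.new('a.txt', head)
obj.etag              # memoizes the current value
etags['a.txt'] = 'v2' # the object changes remotely
obj.etag              # still the memoized value
obj.reload.etag       # new instance, so the value is fetched again
```

A single `reload` keeps the attribute readers free of extra parameters, at the cost of refreshing everything at once.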
|
Also-- do you want me to squash all the commits? I couldn't find a way to do that without having to close this PR and creating a new one. |
|
Hello? |
|
Sorry, this fell off my radar. Looks good to me. |
memoizes data available on the ListBucket request
When listing objects within a bucket the server returns the etag, last-modified and size values for each object. The SDK simply ignores all those values when creating instances of S3Object and if they're needed it makes a HEAD request (one per object).
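As a rough illustration of the request savings, the sketch below seeds each object with the values from a listing so that reading them needs no HEAD. It does not touch the SDK: `ListEntry`, `ListedObject`, `objects_from_listing`, and the `head` callable are hypothetical names used only for this example.

```ruby
# Hypothetical listing entry carrying the fields the listing already returns.
ListEntry = Struct.new(:key, :etag, :last_modified, :size)

class ListedObject
  attr_reader :key

  def initialize(key, head, seed = nil)
    @key = key
    @head = head       # callable simulating a HEAD request
    @attributes = seed # values from the listing, if any
  end

  def etag
    attributes[:etag]
  end

  def size
    attributes[:size]
  end

  private

  # Falls back to a simulated HEAD only when nothing was seeded.
  def attributes
    @attributes ||= @head.call(@key)
  end
end

# Builds objects from listing entries, optionally seeding their attributes.
def objects_from_listing(entries, head, seed: true)
  entries.map do |e|
    attrs = seed ? { etag: e.etag, last_modified: e.last_modified, size: e.size } : nil
    ListedObject.new(e.key, head, attrs)
  end
end

entries = [ListEntry.new('a.txt', 'e1', 0, 10), ListEntry.new('b.txt', 'e2', 0, 20)]
head_calls = 0
head = ->(key) { head_calls += 1; { etag: "head-#{key}", last_modified: 0, size: 0 } }

objects_from_listing(entries, head).each { |o| puts "#{o.key} => #{o.etag}" }
puts "HEAD requests with seeding: #{head_calls}"
```

Without seeding (`seed: false`), the same loop would make one simulated HEAD per object, which is the per-object cost described above.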
I didn't want to change the interface too much-- maybe using the options hash on S3Object isn't the best choice, but consider this patch as more of a conversation starter than anything.
I've been using my fork for a while now and it saves a lot of requests.
Any thoughts?