I am trying to determine the most efficient way to keep a user's rep_change history updated and need some guidance.

Here are a few assumptions that I am inferring from the available information:

the key for rep_change is post_id and user_id along with an implied site id

the on_date field is bumped when a rep-related event for that post_id occurs

So, if I initialize a rep table with the user's complete rep history and save the time of the update, I can subsequently query /users/{id}/reputation with a fromdate equal to the last update and the results will contain all items that have had a rep change since fromdate - even if they have been returned in a previous update.

i.e.

post #10 got a few upvotes when created last year and was entered into my local rep table at that time.

the past xx number of rep updates have not contained that post_id

today, someone found post #10 and upvoted it, so the on_date and positive_rep are updated

the next time I query rep, post #10 will be in the results and I can update the rep table using post_id (and user_id and contrived site_id) as key

Is this an accurate description?

No, and with the clarification found in a comment on the accepted answer, understandably so.

If you're trying to track rep independently of the site, via the API, I think you'll tear your hair out. Rep changes like the -1 for downvoting and the +2 for accepting answers are not exposed at all, as far as I can tell.
– Dave Swersky, Jul 19 '10 at 15:26

@dave - yeah, I noticed that. In my story, a user's rep number will always be pulled from the user object, but in the use case of maintaining an up-to-date rep graph from day one, the variance incurred by downvoting and accepting is negligible and unavoidable given the available data. I am just trying to get a confirmation on the behavior of the rep_change object from the boss.
– Sky Sanders, Jul 19 '10 at 15:38

1 Answer

It's probably easiest to think of this route as a view onto a user's reputation graph.

The rep gained or lost on a user's questions and answers is returned, but the on_date is intentionally ambiguous. on_date ends up being the last voting event that occurred in a collapsed group. All votes on a post in a given period are collapsed based on post_id. No indication is given as to when any vote but the last one was made, nor whether it was an up or down vote.
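To make the collapsing concrete, here is a minimal sketch of the behavior described above. It is not the server's actual implementation: the `collapse` function, the `(post_id, on_date, delta)` event tuples, and the `negative_rep` field name are assumptions made for illustration, and timestamps are plain integers.

```python
from collections import defaultdict


def collapse(events, fromdate, todate):
    """Collapse raw rep events into one row per post_id for the given window.

    `events` is a hypothetical list of (post_id, on_date, delta) tuples.
    Only the date of the last event in the window survives as on_date.
    """
    groups = defaultdict(lambda: {"on_date": 0, "positive_rep": 0, "negative_rep": 0})
    for post_id, on_date, delta in events:
        if not (fromdate <= on_date <= todate):
            continue  # event falls outside the queried window
        row = groups[post_id]
        row["on_date"] = max(row["on_date"], on_date)
        if delta >= 0:
            row["positive_rep"] += delta
        else:
            row["negative_rep"] += -delta
    return {post_id: dict(row) for post_id, row in groups.items()}


# Hypothetical events: post 10 gets two upvotes early and a downvote much later.
events = [(10, 100, 10), (10, 150, 10), (10, 900, -2), (11, 120, 10)]

narrow = collapse(events, 0, 200)
wide = collapse(events, 0, 1000)
print(narrow[10])  # {'on_date': 150, 'positive_rep': 20, 'negative_rep': 0}
print(wide[10])    # {'on_date': 900, 'positive_rep': 20, 'negative_rep': 2}
```

The same post produces a different row, with a different on_date, depending on the window that was queried, which is why the window matters when comparing results from separate requests.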

user_id is returned because the route is vectorized; in that use case you need to be able to map a returned value back to a user.

The key on the results (once de-vectorized, for lack of a better word) is the post_id. Though for caching purposes, you probably want to key on the [post_id, on_date]-tuple, with a caveat.

That caveat is that since the underlying votes are collapsed into [on_date, post_id]-tuples based on the queried window, you have to be aware of that window when updating your cache.

Be aware that depending on how you're using this data, conceptually, there's no guarantee that post_id is unique in the stream.
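One way to respect the window caveat when caching is sketched below. This is only an illustration under stated assumptions: the `RepCache` class, its methods, and the `negative_rep` field are hypothetical, rows are plain dicts, and the sketch simply refuses overlapping windows instead of trying to reconcile them.

```python
class RepCache:
    """Cache of collapsed rep rows keyed on (post_id, on_date).

    Each row remembers the (fromdate, todate) window it was fetched with,
    and overlapping windows are rejected so that no vote is counted twice.
    """

    def __init__(self):
        self.rows = {}     # (post_id, on_date) -> row dict
        self.windows = []  # windows already stored

    def store(self, window, rows):
        fromdate, todate = window
        for seen_from, seen_to in self.windows:
            if fromdate <= seen_to and seen_from <= todate:
                raise ValueError("window overlaps a previously cached window")
        self.windows.append(window)
        for row in rows:
            self.rows[(row["post_id"], row["on_date"])] = row

    def net_rep(self, post_id):
        # post_id is not unique in the cache: one row per window it appeared in.
        return sum(
            row["positive_rep"] - row["negative_rep"]
            for (pid, _), row in self.rows.items()
            if pid == post_id
        )


cache = RepCache()
cache.store((0, 200), [{"post_id": 10, "on_date": 150, "positive_rep": 20, "negative_rep": 0}])
cache.store((201, 1000), [{"post_id": 10, "on_date": 900, "positive_rep": 0, "negative_rep": 2}])
print(cache.net_rep(10))  # 18
```

Post 10 appears twice in the cache, once per window, which matches the point that post_id alone is not guaranteed to be unique in the stream.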

Understood as: the only way to get an accurate view of a complete rep graph is to pull it all in one go, e.g. with a window like fromdate = 2006 and todate = 2020, and the route was not designed with incremental updates in mind.
– Sky Sanders, Jul 19 '10 at 16:33

@code poet - depends what you're trying to update. If you're just trying to show the rep graph, you only have to pull "latest." If you're trying to do something more complicated, then... yeah, probably not. Intentionally obscuring voting data (for privacy purposes) makes the route less flexible than it could otherwise be.
– Kevin Montrose♦, Jul 19 '10 at 16:39

Once write-access is available, will this route be more flexible?
– Nathan Osman♦, Jul 19 '10 at 17:37

@geo - the data is computed dynamically, depending on the window specified, and is inherently read-only. The data is modified by reputation events on posts, so writability can have no effect on this route.
– Sky Sanders, Jul 19 '10 at 17:54

@code: I know that. The key here is that a write-able API will introduce an authentication mechanism, reducing the need for privacy.
– Nathan Osman♦, Jul 19 '10 at 19:17

@geo - in that case, the only events you should/would be given more access to are your own, and since you cannot perform rep-related actions on your own posts, where is the value? Either way, viewing your own, where you have more access, you see no difference, and viewing others', where you have no extra access, you see no difference. Am I missing something?
– Sky Sanders, Jul 19 '10 at 19:26

@code: I'm just sayin'... not that it would be useful - just that your own votes would be there whether they could be used for anything or not.
– Nathan Osman♦, Jul 19 '10 at 19:40

@geo, I have other 'arguments', but a big show-stopper here is that a rep graph would have to be queried, computed, and cached for each person that pulls it, and for some strange reason I don't see that happening.
– Sky Sanders, Jul 19 '10 at 19:54