I've put my previous suggestion into an example. It is similar to many
other suggestions except it is 2 lines of code (by using a newer tool in
the standard library).
>>> countdict = collections.Counter('abcdeabcxyz')
>>> [v for v,c in countdict.items() if c > 1]
['a', 'c', 'b']
This should be O(N).
Wai Yip
On Fri, 08 Apr 2011 10:49:23 -0700, Tim Hatch <tim at timhatch.com> wrote:
> By some quick experiments, it looks like you can double the speed from
> your version by not doing all the reversed/sorted stuff and just keeping
> track of it in a much less exciting way:
>> def unique2(seq):
> seen = set()
> out = []
> for i in seq:
> if i not in seen:
> out.append(i)
> seen.add(i)
> return out
>> # yours
> $ python -m timeit -s 'from demo import unique' 'sum(unique([1, 1, 2, 3,
> 4]))'
> 100000 loops, best of 3: 5.07 usec per loop
> # mine
> $ python -m timeit -s 'from demo import unique2' 'sum(unique2([1, 1, 2,
> 3,
> 4]))'
> 100000 loops, best of 3: 2.65 usec per loop
>> You can actually make it a smidge faster still by making it a generator,
> for at least this tiny case (as it becomes bigger, into the thousands of
> items, it's much more like unique2's timings):
>> def unique3(seq):
> seen = set()
> for i in seq:
> if i not in seen:
> yield i
> seen.add(i)
>> $ python -m timeit -s 'from demo import unique3' 'sum(unique3([1, 1, 2,
> 3,
> 4]))'
> 1000000 loops, best of 3: 1.99 usec per loop
>> Tim
>>> Erratum: the function I suggested is incomplete without this line
>> following the docstring.
>>>> if seq is None: return None
>>>>>> On Thu, Apr 7, 2011 at 6:42 PM, Hy Carrinski <hcarrinski at gmail.com>
>> wrote:
>>> Dear BayPIGgies,
>>>>>> Thank you for providing exceptional reading material over the past
>>> year for an avid conversation follower and occasional meeting
>>> attendee.
>>>>>> I would like to join the conversation and contribute to the community
>>> beginning with this post. I will attend our meeting this month, and
>>> look forward to meeting at Noisebridge or elsewhere to watch PyCon
>>> 2011 videos together.
>>>>>> While thinking about unique lists in the context of Vikram's recent
>>> question, I came across a related blog post at
>>>http://www.peterbe.com/plog/uniqifiers-benchmark>>>>>> The post emphasizes speed and maintaining original list order when
>>> removing duplicates from a list.
>>>>>> Although the post is several years old, I thought of a solution to the
>>> question it poses that is different from the ones proposed by
>>> BayPIGgies either in the blog's comments or at a previous discussion
>>> (before reversed, sorted, and set were introduced in Python 2.4) at
>>>http://code.activestate.com/recipes/52560/>>>>>> I will be grateful for any comments on my ask forgiveness rather than
>>> permission implementation. It requires that all elements of the list
>>> be hashable and works for Python 2.x where x >= 4.
>>>>>> def unique(seq):
>>> Â Â """Return unique list of hashable objects while preserving
>>> order."""
>>> Â Â seen = {}
>>> Â Â for ix in reversed(xrange(len(seq))):
>>> Â Â Â Â seen[seq[ix]] = ix
>>> Â Â return sorted(seen,key=seen.__getitem__)
>>>>>> For the case of hashable objects are there disadvantages to this
>>> implementation versus the ones referenced above? Is there a reason to
>>> prefer operator.itemgetter over __getitem__ in this limited context?
>>> Is there an advantage in clarity or performance to using
>>> collections.OrderedDict in Python 2.7 and above? For lists containing
>>> a great number of distinct objects the sort step will introduce a
>>> lower bound on speed for this implementation.
>>>>>> I look forward to meeting more BayPIGgies in the coming months and
>>> years.
>>>>>> Cheers,
>>> Hy
>>>>> _______________________________________________
>> Baypiggies mailing list
>>Baypiggies at python.org>> To change your subscription options or unsubscribe:
>>http://mail.python.org/mailman/listinfo/baypiggies>>> _______________________________________________
> Baypiggies mailing list
>Baypiggies at python.org> To change your subscription options or unsubscribe:
>http://mail.python.org/mailman/listinfo/baypiggies