New Features on _.each #6

@dwt

Ah, all clear regarding the missing comment. But yes, I would like to add as many overloads as are possible with _.each - so keep coming back if you find something that can be implemented.

With some delay... :)

_.each

  • __rmul__ to support 2*_.each. Similar for other __r...__ functions, see e.g. _(dir(int)).filter(lambda it: it.startswith("__r")).to(list).
  • Invoking methods on _.each doesn't work as expected, e.g. _(dir(int)).filter(_.each.startswith("__r")).to(list) doesn't filter anything out.
  • The ~, &, | operators could be overloaded as eager boolean operators (like numpy does). Precedence rules are an issue, so the comparisons need parentheses, but it would allow writing _(range(10)).filter( (_.each > 3) & (_.each < 7) ).to(list).
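To make the three wishes above concrete, here is a minimal standalone sketch of a placeholder object. It does not use fluentpy and is not how _.each is implemented; the class Each and the helper _binary are hypothetical names invented for this illustration.

```python
import operator


class Each:
    """Hypothetical placeholder that builds a one-argument function from operators."""

    def __init__(self, func=lambda value: value):
        self._func = func

    def __call__(self, value):
        return self._func(value)

    def _binary(self, op, other, reflected=False):
        # `other` may be a plain value or another placeholder expression.
        def resolve(value):
            return other(value) if isinstance(other, Each) else other

        if reflected:
            return Each(lambda value: op(resolve(value), self(value)))
        return Each(lambda value: op(self(value), resolve(value)))

    def __mul__(self, other):
        return self._binary(operator.mul, other)

    def __rmul__(self, other):
        # Called for `2 * each`, after int.__mul__ returns NotImplemented.
        return self._binary(operator.mul, other, reflected=True)

    def __gt__(self, other):
        return self._binary(operator.gt, other)

    def __lt__(self, other):
        return self._binary(operator.lt, other)

    def __and__(self, other):
        return self._binary(operator.and_, other)

    def __or__(self, other):
        return self._binary(operator.or_, other)

    def __invert__(self):
        return Each(lambda value: not self(value))

    def __getattr__(self, name):
        # each.startswith("__r") defers the method call to every item.
        def method(*args, **kwargs):
            return Each(lambda value: getattr(self(value), name)(*args, **kwargs))
        return method


each = Each()
doubled = list(map(2 * each, range(3)))
between = list(filter((each > 3) & (each < 7), range(10)))
reflected = list(filter(each.startswith("__r"), dir(int)))
```

The parentheses around each comparison are required because & binds more tightly than > and <.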

Prefixes

Regarding the prefixes: I really don't know what the best way here would be. I kind of like that the default case is not lazy and I find the I prefix intuitive (though Python 2-ish). At the same time I haven't found a good prefix for a non-lazy variant.

That being said - I tried at least to be internally consistent, but am of course open to suggestions.

"Iterator by default" can be annoying for debugging, because lazy evaluation delays the failure and often doesn't produce useful stack traces. In

import fluentpy as _

def fail(value):
    assert False, "fail"

items = _(range(10)).imap(fail)

for item in items:
    print(item)

the line where items is defined doesn't appear in the backtrace, whereas with map it would. So for my personal use cases, your current choice will actually almost always be preferable.
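The same effect can be reproduced with the standard library alone. A minimal sketch, assuming the builtin map as a stand-in for the lazy imap and list(map(...)) as a stand-in for the eager map; the helper names are made up for illustration:

```python
import traceback


def fail(value):
    raise AssertionError("fail")


def frames_when_consumed(make_items):
    """Return the function names found in the traceback of the failure."""
    try:
        for _item in make_items():
            pass
    except AssertionError as error:
        return [frame.name for frame in traceback.extract_tb(error.__traceback__)]
    return []


def define_lazy():
    # Nothing runs here; fail() is only called once the result is iterated.
    return map(fail, range(10))


def define_eager():
    # fail() runs right here, so this function shows up in the traceback.
    return list(map(fail, range(10)))


lazy_frames = frames_when_consumed(define_lazy)
eager_frames = frames_when_consumed(define_eager)
```

In the lazy case the defining function has already returned by the time fail() raises, so only the consuming loop and fail() itself appear in the traceback.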

On the other hand, making eager evaluation the default means that code written with fluentpy will be brittle. A function might work well with an eager _(...).filter for its original purpose, but suddenly fail once it is passed a very large or infinite iterable.
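A small sketch of that brittleness, using only the standard library rather than fluentpy (the builtin filter is lazy; evens is a hypothetical helper):

```python
from itertools import count, islice


def evens(numbers):
    # Lazy: the builtin filter returns an iterator and pulls items on demand.
    return filter(lambda n: n % 2 == 0, numbers)


# Works even though count() never ends, because only three items are requested.
first_three = list(islice(evens(count()), 3))

# An eager version, e.g. list(filter(...)), would never return for count().
```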

Documentation (1)

It could be useful to point out that the wrappers can be used as decorators to side-step the limited features of lambdas. E.g.

items = _(range(5))
@items.call
def items(its):
    for it in its:
        yield it
        yield it
print(items.to(list))

This would be yet another advantage of the suffix notation for filtering effects...

Documentation (2)

At first I thought it might be useful to provide a way to add custom extensions, e.g. a .groupby that also groups non-consecutive elements, until I realized that this is already covered by .call. It might be worth pointing this out explicitly.
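For illustration, such an extension could be an ordinary function that takes the iterable and returns the grouped result. A minimal sketch; group_all is a hypothetical name, and the snippet does not import fluentpy:

```python
from collections import defaultdict


def group_all(key):
    """Return a function that groups items by key, consecutive or not."""

    def grouper(items):
        groups = defaultdict(list)
        for item in items:
            groups[key(item)].append(item)
        # (key, items) pairs, in order of first appearance of each key.
        return list(groups.items())

    return grouper


grouped = group_all(len)(["a", "bb", "c", "dd", "eee"])
```

The returned grouper takes the iterable as its only argument, so it should fit wherever a function is handed to .call.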

Subprocess utility

A wrapper around subprocess.Popen would be useful, one that allows something like

lines = _(["hello","world"]).pipe(["sed", "s/l/x/g"]).to(list)

It can be done with .call and a function wrapping subprocess.Popen. To prevent deadlocks it requires multiple threads, though. I had some success building a function that allowed _(...).call(pipe(...)).... by having a thread feed the input iterator to Popen.stdin and returning an iterator over Popen.stdout.
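A rough sketch of that approach, not the function I actually used. pipe here is a standalone helper and an assumption about the shape such a utility could take. The usage at the end runs a small Python child process instead of sed so that it does not depend on sed being installed.

```python
import subprocess
import sys
import threading


def pipe(command):
    """Return a function that streams an iterable of lines through `command`."""

    def run(lines):
        process = subprocess.Popen(
            command,
            stdin=subprocess.PIPE,
            stdout=subprocess.PIPE,
            text=True,
        )

        def feed():
            # Runs in a background thread, so a full stdin pipe cannot block
            # the consumer that is reading stdout at the same time.
            try:
                for line in lines:
                    process.stdin.write(line + "\n")
            except BrokenPipeError:
                pass  # the child exited before reading all input
            finally:
                try:
                    process.stdin.close()
                except BrokenPipeError:
                    pass

        threading.Thread(target=feed, daemon=True).start()

        def output():
            # Yield output lines lazily, then reap the child process.
            with process.stdout:
                for line in process.stdout:
                    yield line.rstrip("\n")
            process.wait()

        return output()

    return run


# Child process that upper-cases its input; stands in for e.g. ["sed", "s/l/x/g"].
upper = [
    sys.executable,
    "-c",
    "import sys\nfor line in sys.stdin: sys.stdout.write(line.upper())",
]
result = list(pipe(upper)(["hello", "world"]))
```

Since run takes the iterable as its only argument, pipe(...) has the shape needed for _(...).call(pipe(...)).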

Originally posted by @kbauer in #5 (comment)
