[zz]Why does Python's __import__ require fromlist?
http://www.readmespot.com/question/o/2724260/why-does-python-s-__import__-require-fromlist-
In Python, if you want to programmatically import a module, you can do:
module = __import__('module_name')
If you want to import a submodule, you would think it would be a simple matter of:
module = __import__('module_name.submodule')
Of course, this doesn't work; you just get module_name again. You have to do:
module = __import__('module_name.submodule', fromlist=['blah'])
Why? The actual value of fromlist doesn't seem to matter at all, as long as it's non-empty. What is the point of requiring an argument, then ignoring its values?
Most stuff in Python seems to be done for good reason, but for the life of me, I can't come up with any reasonable explanation for this behavior to exist.
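The behavior described above can be reproduced with a standard-library package. This is only a small sketch: json.decoder is chosen here for illustration (it does not appear in the question), and 'blah' is the same arbitrary placeholder as above.

```python
# json.decoder is a real standard-library submodule, used here only as an example.
top = __import__('json.decoder')                             # empty fromlist
sub = __import__('json.decoder', fromlist=['blah'])          # arbitrary non-empty fromlist
same = __import__('json.decoder', fromlist=['JSONDecoder'])  # a "real" name instead

print(top.__name__)   # json
print(sub.__name__)   # json.decoder
print(sub is same)    # True
```

For a plain submodule like this one, the contents of fromlist make no visible difference; only whether it is empty does.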
Selected Answer
In fact, the behaviour of __import__() is entirely due to the implementation of the import statement, which calls __import__(). There are basically five slightly different ways __import__() can be called by import (with two main categories):
import pkg
import pkg.mod
from pkg import mod, mod2
from pkg.mod import func, func2
from pkg.mod import submod
In the first and the second case, the import statement should assign the "left-most" module object to the "left-most" name: pkg. After import pkg.mod you can do pkg.mod.func() because the import statement introduced the local name pkg, which is a module object that has a mod attribute. So, the __import__() function has to return the "left-most" module object so it can be assigned to pkg. Those two import statements thus translate into:
pkg = __import__('pkg')
pkg = __import__('pkg.mod')
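To see that the second form still imports the submodule even though it returns the left-most module, here is a quick check. It assumes the standard-library package xml and its subpackage xml.dom as stand-ins for pkg and pkg.mod.

```python
import sys

# Roughly what "import xml.dom" does before binding the local name "xml".
pkg = __import__('xml.dom')

print(pkg.__name__)                        # xml
print(pkg.dom is sys.modules['xml.dom'])   # True
```

The submodule is loaded and reachable as an attribute; it just isn't the object that __import__() hands back.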
In the third, fourth and fifth case, the import statement has to do more work: it has to assign to (potentially) multiple names, which it has to get from the module object. The __import__() function can only return one object, and there's no real reason to make it retrieve each of those names from the module object (and it would make the implementation a lot more complicated). So the simple approach would be something like (for the third case):
tmp = __import__('pkg')
mod = tmp.mod
mod2 = tmp.mod2
However, that won't work if pkg is a package and mod or mod2 are modules in that package that are not already imported, as they are in the third and fifth case. The __import__() function needs to know that mod and mod2 are names that the import statement will want to have accessible, so that it can see if they are modules and try to import them too. So the call is closer to:
tmp = __import__('pkg', fromlist=['mod', 'mod2'])
mod = tmp.mod
mod2 = tmp.mod2
which causes __import__() to try and load pkg.mod and pkg.mod2 as well as pkg (but if mod or mod2 doesn't exist, it's not an error in the __import__() call; producing an error is left to the import statement). But that still isn't the right thing for the fourth and fifth example, because if the call were so:
tmp = __import__('pkg.mod', fromlist=['submod'])
submod = tmp.submod
then tmp would end up being pkg, as before, and not the pkg.mod module you want to get the submod attribute from. The implementation could have decided to make it so the import statement does extra work, splitting the package name on . like the __import__() function already does and traversing the names, but this would have meant duplicating some of the effort. So, instead, the implementation made __import__() return the right-most module instead of the left-most one if and only if fromlist is passed and not empty.
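The two effects of fromlist described here (loading the named submodules, and switching the return value to the right-most module) can be observed with a throwaway package. This is a sketch under assumptions: demo_pkg, mod, submod and no_such_name are hypothetical names created only for this example, and the package is written to a temporary directory.

```python
import os
import sys
import tempfile

# Build a throwaway package on disk: demo_pkg/mod/submod.py (all hypothetical names).
root = tempfile.mkdtemp()
pkg_dir = os.path.join(root, 'demo_pkg')
sub_dir = os.path.join(pkg_dir, 'mod')
os.makedirs(sub_dir)
for path in (os.path.join(pkg_dir, '__init__.py'),
             os.path.join(sub_dir, '__init__.py'),
             os.path.join(sub_dir, 'submod.py')):
    open(path, 'w').close()
sys.path.insert(0, root)

# Empty fromlist: only the top-level package is loaded.
top = __import__('demo_pkg')
print('demo_pkg.mod' in sys.modules)    # False

# Non-empty fromlist: demo_pkg.mod is imported too; the unknown name is not an error here.
top = __import__('demo_pkg', fromlist=['mod', 'no_such_name'])
print('demo_pkg.mod' in sys.modules)    # True
print(hasattr(top, 'no_such_name'))     # False

# Dotted name plus non-empty fromlist: the right-most module comes back.
right = __import__('demo_pkg.mod', fromlist=['submod'])
print(right.__name__)                   # demo_pkg.mod
print(right.submod.__name__)            # demo_pkg.mod.submod
```

Note that the missing name only surfaces as an ImportError when the import statement itself tries to fetch the attribute, which matches the division of labour described above.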
(The import pkg as p and from pkg import mod as m syntax doesn't change anything about this story except which local names get assigned to. The __import__() function sees nothing different when as is used; it all remains in the import statement implementation.)
The answer can be found in the documentation for __import__:
The fromlist should be a list of names to emulate from name import ..., or an empty list to emulate import name. When importing a module from a package, note that __import__('A.B', ...) returns package A when fromlist is empty, but its submodule B when fromlist is not empty.
So basically, that's just how the implementation of __import__ works: if you want the submodule, you pass a fromlist containing something you want to import from the submodule, and the implementation of __import__ is such that the submodule is returned.
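In practice this usually gets wrapped in a small helper. The sketch below shows two ways to reach the submodule; the helper names import_rightmost and import_by_walking are made up for this example, and xml.dom.minidom is just a convenient standard-library module to test with.

```python
import importlib


def import_rightmost(dotted_name):
    """Return the module named by dotted_name (hypothetical helper)."""
    # Any non-empty fromlist makes __import__ return the right-most module.
    return __import__(dotted_name, fromlist=['_'])


def import_by_walking(dotted_name):
    """Same result without fromlist: take the left-most module and walk down."""
    module = __import__(dotted_name)
    for part in dotted_name.split('.')[1:]:
        module = getattr(module, part)
    return module


name = 'xml.dom.minidom'
print(import_rightmost(name).__name__)                        # xml.dom.minidom
print(import_rightmost(name) is import_by_walking(name))      # True
print(import_rightmost(name) is importlib.import_module(name))  # True
```

If all you want is the submodule by name, importlib.import_module() is what the Python documentation recommends over calling __import__ directly.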
Further explanation
I think the semantics exist so that the most relevant module is returned. In other words, say I have a package foo containing module bar with function baz. If I do:
import foo.bar
Then I refer to baz as
foo.bar.baz()
This is like __import__("foo.bar", fromlist=[]).
If instead I import with:
from foo import bar
Then I refer to baz as bar.baz()
Which would be similar to __import__("foo.bar", fromlist=["something"]).
If I do:
from foo.bar import baz
Then I refer to baz as
baz()
Which is like __import__("foo.bar", fromlist=["baz"]).
So in the first case, I'd have to use the fully-qualified name, hence __import__ returns the first module name you'd use to refer to the imported elements, that being foo. In the last case, bar is the most specific module containing the imported elements, so it makes sense that __import__ would return the foo.bar module.
The second case is a little weird, but I am guessing it was written that way to support importing a module using the from <package> import <module> syntax, and in that case bar is still the most specific module to return.
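The three cases can be lined up against a real package to check the intuition. This is a rough sketch, assuming the standard-library json package in place of foo, its decoder module in place of bar, and the JSONDecoder class in place of baz. For the second case it follows the selected answer's description, where the import statement passes the package name and puts the module name in fromlist.

```python
# import json.decoder  ->  binds the left-most module
json = __import__('json.decoder')

# from json import decoder  ->  package name, submodule listed in fromlist
decoder = __import__('json', fromlist=['decoder']).decoder

# from json.decoder import JSONDecoder  ->  right-most module, then attribute lookup
JSONDecoder = __import__('json.decoder', fromlist=['JSONDecoder']).JSONDecoder

print(json.__name__)                             # json
print(decoder is json.decoder)                   # True
print(JSONDecoder is json.decoder.JSONDecoder)   # True
```

In each case the object returned by __import__ is the module the import statement then pulls names from.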