Welcome to OGeek Q&A Community for programmer and developer-Open, Learning and Share
Welcome To Ask or Share your Answers For Others


scikit-learn - Using a hyphen/dash in a Python repository name and package name

I am trying to make my git repository pip-installable. In preparation for that I am restructuring the repo to follow the right conventions. My understanding from looking at other repositories is that I should put all my source code in a package that has the same name as the repository name. E.g. if my repository is called myrepo, then the source code would all go into a package also called myrepo.

My repository has a hyphen in its name for readability: e.g. my-repo. So if I wanted to make a package for it with the same name, it would have a hyphen in it as well. In this tutorial it says "don't use hyphens" for Python package names. However, I've seen well-established packages such as scikit-learn that have hyphens in their names. One thing that I have noticed, though, is that in the scikit-learn repo the package name is not the same as the repo name and is instead called sklearn.

I think my discussion above boils down to the following questions:

  1. When packaging a repo, what is the relationship between the repository's name and the package's name? Is there anything to beware of when having names that don't match?
  2. Is it okay to have hyphens in package names? What about in repository names?
  3. If the package name for scikit-learn is sklearn, then how come when I install it I do pip install scikit-learn instead of pip install sklearn?

1 Reply


To answer your first point, let me rephrase my answer to a different question.

The biggest source of misunderstanding is that the word "package" is heavily overloaded. There are four different names in play: the name of the repository; the name of the directory used for development (the one that contains setup.py); the name of the directory containing __init__.py and the other importable modules; and the name of the distribution on PyPI. Quite often these four are the same or similar, but that is not required.

The repository and the development directory can be named anything; their names play no role in packaging or importing. Of course it is convenient to name them sensibly, but that is only a convenience.

The name of the directory with the Python files is the name of the package to be imported. Once a package has been published under an import name, that name usually sticks and is hard to change, because users' import statements depend on it.

The name of the distribution determines the project's page on PyPI and the names of the distribution files (source distributions, eggs, wheels). It is the name one passes in the setup(name='distribution') call.
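To make the four names concrete, here is a minimal sketch that builds a throwaway project in a temporary directory. Every name in it (my-repo, my_repo, GREETING, the version number) is hypothetical and chosen only to mirror the question; it is not taken from any real project. The development directory and the distribution name both contain a hyphen, while the importable package uses an underscore.

```python
import importlib
import sys
import tempfile
from pathlib import Path

# All names below are hypothetical, chosen to mirror the question.
REPO_DIR = "my-repo"     # repository / development directory: a hyphen is fine
IMPORT_NAME = "my_repo"  # importable package: must be a valid identifier
DIST_NAME = "my-repo"    # distribution name given to setup(): a hyphen is fine

# Text of a minimal setup.py for this layout (written to disk, not executed).
SETUP_PY = f'''\
from setuptools import setup

setup(
    name="{DIST_NAME}",          # what users type after "pip install"
    version="0.1.0",
    packages=["{IMPORT_NAME}"],  # what users type after "import"
)
'''


def build_layout(root: Path) -> Path:
    """Create my-repo/setup.py and my-repo/my_repo/__init__.py under root."""
    dev_dir = root / REPO_DIR
    pkg_dir = dev_dir / IMPORT_NAME
    pkg_dir.mkdir(parents=True)
    (dev_dir / "setup.py").write_text(SETUP_PY)
    (pkg_dir / "__init__.py").write_text("GREETING = 'hello'\n")
    return dev_dir


def import_from(dev_dir: Path):
    """Import the package directly from the development directory."""
    sys.path.insert(0, str(dev_dir))
    try:
        return importlib.import_module(IMPORT_NAME)
    finally:
        sys.path.remove(str(dev_dir))


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        dev = build_layout(Path(tmp))
        module = import_from(dev)
        print(dev.name, "->", module.__name__)
```

The hyphen in the directory name never reaches the import system; only the name of the directory that holds __init__.py does.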

Let me show a detailed real example. I've been maintaining a templating library called CheetahTemplate. I develop it in a development directory called cheetah3/. The distribution on PyPI is called Cheetah3; this is the name I put into setup(name='Cheetah3'). The top-level package is Cheetah, hence one writes import Cheetah.Template or from Cheetah import Template; that means I have a directory cheetah3/Cheetah/.

The answer to 2 is: you can have dashes in repository names and PyPI distribution names, but not in the names of packages (directories with __init__.py files) or modules (.py files), because you cannot write import xy-zzy in Python: the hyphen is parsed as a minus sign, and the statement is a SyntaxError.
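The rule can be checked without creating any files. The sketch below uses only the standard library; the helper names is_importable_name and import_statement_compiles are made up for this illustration. The first tests each dotted part against Python's identifier rules, the second asks the compiler whether the import statement parses at all (compiling does not perform the import).

```python
import keyword


def is_importable_name(name: str) -> bool:
    """True if every dotted part of name is a valid, non-keyword identifier."""
    return all(
        part.isidentifier() and not keyword.iskeyword(part)
        for part in name.split(".")
    )


def import_statement_compiles(name: str) -> bool:
    """True if 'import <name>' is syntactically valid; nothing is imported."""
    try:
        compile(f"import {name}", "<example>", "exec")
    except SyntaxError:
        return False
    return True


if __name__ == "__main__":
    for candidate in ("my_repo", "my-repo", "sklearn", "scikit-learn"):
        print(candidate, is_importable_name(candidate),
              import_statement_compiles(candidate))
```

A check like this can be handy when choosing the import name for a repository whose own name contains a hyphen.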

Point 3: The site, the repository and the distribution are all named scikit-learn, which is why you run pip install scikit-learn, but the importable package (the top-level directory with __init__.py) is sklearn, which is why you write import sklearn.

PEP 8 has nothing to do with the question as it doesn't talk about distribution, only about importable packages and modules.



...