Welcome to OGeek Q&A Community for programmer and developer-Open, Learning and Share
Welcome To Ask or Share your Answers For Others

uproot - awkward array ak.unzip behaviour

When I access a ROOT file and extract the data I want, as in the following example:

import uproot

events = uproot.open(filename)["btagana/ttree;6"]
jet_data = events.arrays(filter_name=["Jet_nFirstTrack", "Jet_nLastTrack", "Jet_pt", "Jet_phi", "Jet_eta"], library="ak")

the order of the keys in the resulting array does not match the order of the list I used to filter the keys. If I now use ak.unzip():

jet_data=ak.unzip(jet_data)

is the ordering reliable and reproducible? If I open different ROOT files, will I get the same ordering?

question from:https://stackoverflow.com/questions/65623014/awkward-array-ak-unzip-behaviour

1 Reply

This is actually a question about Uproot. In this line:

>>> jet_data=events.arrays(filter_name=["Jet_nFirstTrack","Jet_nLastTrack","Jet_pt","Jet_phi","Jet_eta"],library="ak")

the filter_name is just a filter, accepting or rejecting branches from the ROOT file. Those branches have a natural order in the file, and the output is probably that order (and therefore stable upon repeated attempts, unless a dict is involved at some point and you're using Python <= 3.5).

If you want to enforce an order, pass your list of branch names as expressions, rather than filter_name. That argument has a different meaning: expressions can be simple formulas, whereas filter_name can have wildcards, so a character like * means very different things in each!

Alternatively, you can reorder the fields after reading the array by slicing with a list of strings. There's no performance penalty for doing so: it only rearranges metadata, so the time to completion does not scale with the length of the array. The Awkward Array documentation has some examples (including more complex cases where you're selecting fields within fields, but the simple case is enough for your issue).

Edit: I should add that fields of records in Awkward Arrays have a reproducible order. They're not unstable hashmaps like dicts in Python <= 3.5. They're actually two equal-length lists: the ordered fields (which is what ak.unzip returns) and the ordered field names (which ak.fields returns). The names are optional: without field names, records become tuples.

