Welcome to OGeek Q&A Community for programmer and developer-Open, Learning and Share
Welcome To Ask or Share your Answers For Others


pandas - Python: UNION multiple dataframe tables with pd.concat

I want to UNION multiple tables (pandas DataFrames), but the number of tables is not known in advance: it depends on the unique YearMonth values in the dataset. I would like to call pd.concat([df1, df2, ...]), but it is unclear to me how to create the variable names df1, df2, and so on inside the loop (assigning the i-th table to the i-th name) so that I can concatenate them later. The code below produces one table per month.

import json
import pandas as pd
import datetime
import calendar

def add_months(sourcedate, months):
    month = sourcedate.month - 1 + months
    year = sourcedate.year + month // 12
    month = month % 12 + 1
    day = min(sourcedate.day, calendar.monthrange(year,month)[1])
    return datetime.date(year, month, day)

def union_dataframes():
    startDate = datetime.datetime.strptime('2020-12-01', "%Y-%m-%d").date() 
    i=1

    while startDate < datetime.datetime.now().date():

        data = '''
        {
            "columnHeaders": [
                {
                    "name": "country"
                },
                {
                    "name": "amount"
                },
                {
                    "name": "quantity"
                }
            ],
            "rows": [
                [
                    "NL",
                    428226,
                    22738
                ]
            ]
        }
        '''
        data = json.loads(data)


        columns = [dct['name'] for dct in data['columnHeaders']]
        df = pd.DataFrame(data['rows'], columns=columns)
        df['month'] = startDate

        print(df) #prints table per month, that I want to assign to df1, df2, df3, etc.
        startDate = add_months(startDate,1)
        i += 1

union_dataframes()

Any suggestions on how I can pull this off? Or is there a different way to UNION these tables?

Kind regards, Igor



1 Reply


I marked the new code with #<<<<<. Collect the DataFrames in a list and concatenate them once at the end. You probably don't need the counter i, either.

def union_dataframes():
    df_hold_list = [] #<<<<<
    startDate = datetime.datetime.strptime('2020-12-01', "%Y-%m-%d").date() 
    i=1
    
    while startDate < datetime.datetime.now().date():

        data = '''
        {
            "columnHeaders": [
                {
                    "name": "country"
                },
                {
                    "name": "amount"
                },
                {
                    "name": "quantity"
                }
            ],
            "rows": [
                [
                    "NL",
                    428226,
                    22738
                ]
            ]
        }
        '''
        data = json.loads(data)


        columns = [dct['name'] for dct in data['columnHeaders']]
        df = pd.DataFrame(data['rows'], columns=columns)
        df['month'] = startDate
        df_hold_list.append(df) #<<<<<
        print(df) # prints the table for the current month (optional, for inspection only)
        startDate = add_months(startDate,1)
        i += 1
    return df_hold_list #<<<<<

df_hold_list = union_dataframes() #<<<<<
df_union = pd.concat(df_hold_list, axis=0) #<<<<<
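
For comparison, here is a more compact, self-contained sketch of the same collect-then-concatenate idea. The helper names month_starts and fetch_month are hypothetical (they are not from the question), fetch_month simply returns the sample row from the question instead of running a real query, and the end date is fixed so the result is reproducible. Passing ignore_index=True is an optional choice that renumbers the rows 0..n-1 instead of repeating index 0 for every month.

```python
import datetime

import pandas as pd


def month_starts(start, end):
    """Yield the first day of each month from start up to (not including) end."""
    current = start.replace(day=1)
    while current < end:
        yield current
        # Advance one month, rolling over into the next year after December.
        year = current.year + current.month // 12
        month = current.month % 12 + 1
        current = datetime.date(year, month, 1)


def fetch_month(month):
    """Hypothetical stand-in for the per-month query; returns the sample row."""
    df = pd.DataFrame(
        [["NL", 428226, 22738]],
        columns=["country", "amount", "quantity"],
    )
    df["month"] = month
    return df


# Build one DataFrame per month in a list; no df1, df2, ... names are needed.
frames = [
    fetch_month(m)
    for m in month_starts(datetime.date(2020, 12, 1), datetime.date(2021, 3, 1))
]

# Stack all monthly tables vertically (the UNION ALL equivalent).
df_union = pd.concat(frames, ignore_index=True)
print(df_union)
```

Note that pd.concat raises a ValueError when the list is empty, so if the date range can yield no months it is worth checking the list before concatenating.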


...