Mailing List Archive

Select columns based on dates - Pandas
Hello, forum,
I have a data frame with covid-19 cases per month from 2019 - 2021 like a header like this:

Country, 01/01/2019, 2/01/2019, 01/02/2019, 3/01/2019, ... 01/01/2021, 2/01/2021, 01/02/2021, 3/01/2021

I want to filter my data frame for columns of a specific month range of march to September of 2019, 2020, and 2021 only (three data frames).

Any ideas?
Thank you


--
https://mail.python.org/mailman/listinfo/python-list
Re: Select columns based on dates - Pandas [ In reply to ]
You may want to reshape the dataset to a tidy format: Pandas works
better with that format.

Let's assume the following dataset (this is what I understood from your
message):

In [34]: df = pd.DataFrame({
...: 'Country': ['us', 'uk', 'it'],
...: '01/01/2019': [10, 20, 30],
...: '02/01/2019': [12, 22, 32],
...: '03/01/2019': [14, 24, 34],
...: })

In [35]: df
Out[35]:
Country 01/01/2019 02/01/2019 03/01/2019
0 us 10 12 14
1 uk 20 22 24
2 it 30 32 34

Then, reshape it to a tidy format. Notice how each row now represents
a single measure.

In [43]: pd.melt(df, id_vars=['Country'], var_name='Date',
value_name='Cases')
Out[43]:
Country Date Cases
0 us 01/01/2019 10
1 uk 01/01/2019 20
2 it 01/01/2019 30
3 us 02/01/2019 12
4 uk 02/01/2019 22
5 it 02/01/2019 32
6 us 03/01/2019 14
7 uk 03/01/2019 24
8 it 03/01/2019 34

I used strings to represent the dates but it is much handy work
with real date objects.

In [44]: df2 = _
In [45]: df2['Date'] = pd.to_datetime(df2['Date'])

Now we can filter by date:

In [50]: df2[df2['Date'] < '2019-03-01']
Out[50]:
Country Date Cases
0 us 2019-01-01 10
1 uk 2019-01-01 20
2 it 2019-01-01 30
3 us 2019-02-01 12
4 uk 2019-02-01 22
5 it 2019-02-01 32

With that you could create three dataframes, one per month.

Thanks,
Martin.


On Thu, Sep 02, 2021 at 12:28:31PM -0700, Richard Medina wrote:
>Hello, forum,
>I have a data frame with covid-19 cases per month from 2019 - 2021 like a header like this:
>
>Country, 01/01/2019, 2/01/2019, 01/02/2019, 3/01/2019, ... 01/01/2021, 2/01/2021, 01/02/2021, 3/01/2021
>
>I want to filter my data frame for columns of a specific month range of march to September of 2019, 2020, and 2021 only (three data frames).
>
>Any ideas?
>Thank you
>
>
>--
>https://mail.python.org/mailman/listinfo/python-list
--
https://mail.python.org/mailman/listinfo/python-list
Re: Select columns based on dates - Pandas [ In reply to ]
On Friday, September 3, 2021 at 11:57:12 AM UTC-5, Martin Di Paola wrote:
> You may want to reshape the dataset to a tidy format: Pandas works
> better with that format.
>
> Let's assume the following dataset (this is what I understood from your
> message):
>
> In [34]: df = pd.DataFrame({
> ...: 'Country': ['us', 'uk', 'it'],
> ...: '01/01/2019': [10, 20, 30],
> ...: '02/01/2019': [12, 22, 32],
> ...: '03/01/2019': [14, 24, 34],
> ...: })
>
> In [35]: df
> Out[35]:
> Country 01/01/2019 02/01/2019 03/01/2019
> 0 us 10 12 14
> 1 uk 20 22 24
> 2 it 30 32 34
>
> Then, reshape it to a tidy format. Notice how each row now represents
> a single measure.
>
> In [43]: pd.melt(df, id_vars=['Country'], var_name='Date',
> value_name='Cases')
> Out[43]:
> Country Date Cases
> 0 us 01/01/2019 10
> 1 uk 01/01/2019 20
> 2 it 01/01/2019 30
> 3 us 02/01/2019 12
> 4 uk 02/01/2019 22
> 5 it 02/01/2019 32
> 6 us 03/01/2019 14
> 7 uk 03/01/2019 24
> 8 it 03/01/2019 34
>
> I used strings to represent the dates but it is much handy work
> with real date objects.
>
> In [44]: df2 = _
> In [45]: df2['Date'] = pd.to_datetime(df2['Date'])
>
> Now we can filter by date:
>
> In [50]: df2[df2['Date'] < '2019-03-01']
> Out[50]:
> Country Date Cases
> 0 us 2019-01-01 10
> 1 uk 2019-01-01 20
> 2 it 2019-01-01 30
> 3 us 2019-02-01 12
> 4 uk 2019-02-01 22
> 5 it 2019-02-01 32
>
> With that you could create three dataframes, one per month.
>
> Thanks,
> Martin.
> On Thu, Sep 02, 2021 at 12:28:31PM -0700, Richard Medina wrote:
> >Hello, forum,
> >I have a data frame with covid-19 cases per month from 2019 - 2021 like a header like this:
> >
> >Country, 01/01/2019, 2/01/2019, 01/02/2019, 3/01/2019, ... 01/01/2021, 2/01/2021, 01/02/2021, 3/01/2021
> >
> >I want to filter my data frame for columns of a specific month range of march to September of 2019, 2020, and 2021 only (three data frames).
> >
> >Any ideas?
> >Thank you
> >
> >
> >--
> >https://mail.python.org/mailman/listinfo/python-list
Thank you, Martin,
Is it possible to select/filter the dates (like columns from a date range) from the columns without reshaping them?
Thank you for your answer.
--
https://mail.python.org/mailman/listinfo/python-list