Payment Analysis#
Number of Payments#
pb.configure(
df = df_payments
, metric = 'payment_sequential'
, metric_label = 'Share of Payments'
, metric_label_for_distribution = 'Number of Payments'
, agg_func = 'count'
, norm_by = 'all'
, axis_sort_order='descending'
, text_auto='.1%'
, update_fig={'xaxis': {'tickformat': '.0%'}}
)
print(f'Total number of payments in order: {df_payments.payment_sequential.count():,}')
Total number of payments in order: 103,520
Let’s look at the statistics and distribution of the number of payments per day.
tmp_df_res = (
df_orders.merge(df_payments, on='order_id', how='left')
.groupby(pd.Grouper(key='order_purchase_dt', freq='D'), observed=False)['payment_sequential']
.count()
.to_frame('payments_cnt_per_day')
)
tmp_df_res['payments_cnt_per_day'].explore.info(
labels=dict(payments_cnt_per_day='Number of Payments per Day')
, title='Distribution of Number of Payments per Day'
)
Summary | Percentiles | Detailed Stats | Value Counts | |||||||
---|---|---|---|---|---|---|---|---|---|---|
Total | 604 (100%) | Max | 1.21k | Mean | 171.39 | 202 | 10 (2%) | |||
Missing | --- | 99% | 379 | Trimmed Mean (10%) | 166.52 | 160 | 9 (1%) | |||
Distinct | 268 (44%) | 95% | 312.55 | Mode | 202 | 80 | 7 (1%) | |||
Non-Duplicate | 102 (17%) | 75% | 229 | Range | 1213 | 147 | 7 (1%) | |||
Duplicates | 336 (56%) | 50% | 160 | IQR | 121 | 138 | 6 (<1%) | |||
Dup. Values | 166 (27%) | 25% | 108 | Std | 93.56 | 204 | 6 (<1%) | |||
Zeros | --- | 5% | 47.15 | MAD | 84.51 | 108 | 6 (<1%) | |||
Negative | --- | 1% | 8.12 | Kurt | 24.75 | 122 | 5 (<1%) | |||
Memory Usage | <1 Mb | Min | 1 | Skew | 2.58 | 128 | 5 (<1%) |

Key Observations:
75% of days have ≤230 payments
Top 5% have ≥312 payments
Several days exceeded 500 payments
Let’s look at top days.
tmp_df_res.sort_values('payments_cnt_per_day', ascending=False).head()
payments_cnt_per_day | |
---|---|
order_purchase_dt | |
2017-11-24 | 1214 |
2017-11-25 | 538 |
2017-11-27 | 420 |
2017-11-26 | 411 |
2017-11-28 | 393 |
Key Observations:
Black Friday had anomalous payment volumes
By Payment Type
pb.bar_groupby(y='payment_type', to_slide=True)

Key Observations:
Payment method distribution:
Credit card: 74%
Boleto: 19%
Voucher: 5.5%
Debit card: 1.5%
Payment Value#
pb.configure(
df = df_payments
, metric = 'payment_value'
, metric_label = 'Average Payment Value, R$'
, metric_label_for_distribution = 'Payment Value, R$'
, agg_func = 'mean'
, axis_sort_order='descending'
, text_auto='.3s'
)
Let’s see at statistics and distribution of the metric.
pb.metric_info(
upper_quantile=0.95
, hist_mode='dual_hist_trim'
)
Summary | Percentiles | Detailed Stats | Value Counts | |||||||
---|---|---|---|---|---|---|---|---|---|---|
Total | 103.52k (100%) | Max | 13.66k | Mean | 154.02 | 50 | 323 (<1%) | |||
Missing | --- | 99% | 1.04k | Trimmed Mean (10%) | 114.80 | 20 | 272 (<1%) | |||
Distinct | 28.99k (28%) | 95% | 437.19 | Mode | 50 | 100 | 254 (<1%) | |||
Non-Duplicate | 13.06k (13%) | 75% | 171.77 | Range | 13.66k | 77.57 | 250 (<1%) | |||
Duplicates | 74.53k (72%) | 50% | 100 | IQR | 114.99 | 35 | 165 (<1%) | |||
Dup. Values | 15.93k (15%) | 25% | 56.78 | Std | 217.46 | 73.34 | 160 (<1%) | |||
Zeros | 8 (<1%) | 5% | 26.13 | MAD | 76.40 | 30 | 133 (<1%) | |||
Negative | --- | 1% | 6.71 | Kurt | 242.72 | 116.94 | 131 (<1%) | |||
Memory Usage | 1 | Min | 0 | Skew | 9.27 | 56.78 | 122 (<1%) |

Key Observations:
75% of payments are ≤172 R$
Top 5% are ≥440 R$
Several exceed 6,000 R$
By Payment Type
pb.bar_groupby(y='payment_type', to_slide=True)

Key Observations:
Credit card payments have highest average value
Vouchers have lowest
Number of Payment Installments#
pb.configure(
df = df_payments
, metric = 'payment_installments'
, metric_label = 'Average Payment Installments'
, metric_label_for_distribution = 'Payment Installments'
, agg_func = 'mean'
, axis_sort_order='descending'
, text_auto='.2s'
)
Let’s see at statistics and distribution of the metric.
pb.metric_info()
Summary | Percentiles | Detailed Stats | Value Counts | |||||||
---|---|---|---|---|---|---|---|---|---|---|
Total | 103.52k (100%) | Max | 24 | Mean | 2.85 | 1 | 52.38k (51%) | |||
Missing | --- | 99% | 10 | Trimmed Mean (10%) | 2.28 | 2 | 12.38k (12%) | |||
Distinct | 23 (<1%) | 95% | 10 | Mode | 1 | 3 | 10.42k (10%) | |||
Non-Duplicate | 2 (<1%) | 75% | 4 | Range | 23 | 4 | 7.07k (7%) | |||
Duplicates | 103.50k (99%) | 50% | 1 | IQR | 3 | 10 | 5.29k (5%) | |||
Dup. Values | 21 (<1%) | 25% | 1 | Std | 2.69 | 5 | 5.22k (5%) | |||
Zeros | --- | 5% | 1 | MAD | 0 | 8 | 4.26k (4%) | |||
Negative | --- | 1% | 1 | Kurt | 2.56 | 6 | 3.90k (4%) | |||
Memory Usage | 1 | Min | 1 | Skew | 1.66 | 7 | 1.61k (2%) |

Key Observations:
51% of payments are single-installment
Some have ≥9 installments
By Payment Type
pb.bar_groupby(y='payment_type', to_slide=True)

Key Observations:
Credit cards average 3.5 installments
Other methods have no installments