Data collection Notebook

Code for processing data from studies and saving it in tidy format

1. AUS Reidentification with AGAB Paper

1.1 Paper Extract

1.2 Create and save df as csv

2. US/CAN Levels of Satisfaction Study

2.1 Paper Extract

2.2 Create and save df as csv

import pandas as pd

1. AUS Reidentification with Sex Assigned at Birth

1.1 Paper Extract


From the original study, Reidentification With Birth-Registered Sex in a Western Australian Pediatric Gender Clinic Cohort:

(Emphasis added by me)

Results Of 552 closed referrals during the study period, a reason for closure could be determined for 548 patients, including 211 birth-registered males (mean [SD] age, 13.88 [2.00] years) and 337 birth-registered females (mean [SD] age, 15.81 [2.22] years). Patients who reidentified with their birth-registered sex comprised 5.3% (29 of 548; 95% CI, 3.6%-7.5%) of all referral closures. Except for 2 patients, reidentification occurred before or during early stages of assessment (93.1%; 95% CI, 77.2%-99.2%). Two patients who reidentified with their birth-registered sex did so following initiation of puberty suppression or gender-affirming hormone treatment (1.0% of 196 patients who initiated any gender-affirming medical treatment; 95% CI, 0.1%-3.6%).

JAMA Pediatr. 2024;178(5):446-453. doi:10.1001/jamapediatrics.2024.0077
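The percentages quoted in the extract can be re-derived from the counts it gives. This is a minimal sketch in plain Python; the variable names are mine, and the confidence intervals are not recomputed because the extract does not say which method was used.

```python
# Counts taken from the abstract quoted above
closed_known = 548          # closures with a determinable reason
reidentified = 29           # reidentified with birth-registered sex
reid_after_treatment = 2    # reidentified after starting blockers or hormones
reid_early = reidentified - reid_after_treatment  # before or during early assessment
treated = 196               # initiated any gender-affirming medical treatment

pct_reid = round(100 * reidentified / closed_known, 1)
pct_early = round(100 * reid_early / reidentified, 1)
pct_post = round(100 * reid_after_treatment / treated, 1)

print(f'Reidentified, of known closures: {pct_reid}%')
print(f'Of those, before/during early assessment: {pct_early}%')
print(f'Reidentified, of those who started treatment: {pct_post}%')
```

These match the 5.3%, 93.1% and 1.0% reported in the abstract.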

1.2 Create and save df as csv


## make alluvial table

# one row per flow through the chart; each column is one axis of the alluvial plot:
# axis1 - the total referrals, as a single block
# axis2 - reason for closure determined or not
# axis3 - early assessment stage (single block)
# axis4 - started GAHT or reidentified at this stage (2 blocks)
# axis5 - carried on with GAHT or reidentified at this stage (2 blocks)
# N - the frequency column
df = pd.DataFrame(
    {
       'Total Referrals':['Closed referrals']*5,
       'Closure_reason':['Unknown']+ ['Known']*4,
       'Assessment':[None] + ['Early assessment stage']*4,
       'Next_stage': [None]*2 + ['Reidentification w AGAB'] + ['Started GAHT']*2,
       'Treatment_stage':[None]*3+ ['Reidentification w AGAB']+['Carried on with GAHT'],
       'N':[4, 548, 27, 2, 548-(27+2) ]
    }, 
    index=list(range(5))
)
df.to_csv('data/aus_reid_study/alluvial_research_paper_outcomes_table_by_AGAB.csv')
df

	Total Referrals	Closure_reason	Assessment	Next_stage	Treatment_stage	N
0	Closed referrals	Unknown	None	None	None	4
1	Closed referrals	Known	Early assessment stage	None	None	548
2	Closed referrals	Known	Early assessment stage	Reidentification w AGAB	None	27
3	Closed referrals	Known	Early assessment stage	Started GAHT	Reidentification w AGAB	2
4	Closed referrals	Known	Early assessment stage	Started GAHT	Carried on with GAHT	519
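In the table above, row 1 carries the 548 known closures as a single block and rows 2 to 4 break that same 548 down further, so summing N over every row would double count. A rough sanity check, assuming that reading of the rows is the intended one (the `flows` frame below just re-types the relevant columns):

```python
import pandas as pd

# Re-typed subset of the alluvial table above
flows = pd.DataFrame({
    'Closure_reason': ['Unknown', 'Known', 'Known', 'Known', 'Known'],
    'Next_stage': [None, None, 'Reidentification w AGAB', 'Started GAHT', 'Started GAHT'],
    'N': [4, 548, 27, 2, 519],
})

known = flows[flows['Closure_reason'] == 'Known']
# the block row has no Next_stage; the detail rows do
known_total = known.loc[known['Next_stage'].isna(), 'N'].sum()
known_detail = known.loc[known['Next_stage'].notna(), 'N'].sum()
unknown_total = flows.loc[flows['Closure_reason'] == 'Unknown', 'N'].sum()

print(known_total, known_detail, known_total + unknown_total)
```

If the block row and the detail rows ever disagree, the chart built from this csv will show flows that do not add up.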

df = pd.DataFrame(
    {
       'Assessment':['Early Assessment Stage']*3,
       'Next_stage': ['Detrans pre-treatment'] + ['Treatment']*2,
       'Treatment_stage':['Detrans pre-treatment']+ ['Detrans during treatment']+['Treatment'],
       'N':[27, 2, 548-(27+2) ]
    }, 
    index=list(range(3))
)
df.to_csv('data/aus_reid_study/alluvial_research_paper_outcomes_table_by_AGAB_post_assessment.csv')
df

	Assessment	Next_stage	Treatment_stage	N
0	Early Assessment Stage	Detrans pre-treatment	Detrans pre-treatment	27
1	Early Assessment Stage	Treatment	Detrans during treatment	2
2	Early Assessment Stage	Treatment	Treatment	519

# create the simple waffle chart data

df = pd.DataFrame(
    {
       'Category':['Unknown outcome', 
                   'Reid w AGAB at early assessment stage', 'Reid w AGAB after early assessment stage',
                    'Still pursuing GAHT'],
        'Lay_Category':['Unknown result', 'Detrans before hormones', 'Detrans after hormones',
                        'Still transgender'
                        ],
        'N':[4, 27, 2, 519], 
        'colour':[ 'grey', 'pink', 'red', 'green'],
        'Received any hormonal treatment':['No / Unknown', 'No / Unknown', 'Yes', 'Yes']
    }, 
)

df['n_100'] = round(100*df['N']/df['N'].sum())
df['n_1000'] = round(1000*df['N']/df['N'].sum())

df.to_csv('data/aus_reid_study/waffle_chart_data_jama_study.csv')
df

	Category	Lay_Category	N	colour	Received any hormonal treatment	n_100	n_1000
0	Unknown outcome	Unknown result	4	grey	No / Unknown	1.0	7.0
1	Reid w AGAB at early assessment stage	Detrans before hormones	27	pink	No / Unknown	5.0	49.0
2	Reid w AGAB after early assessment stage	Detrans after hormones	2	red	Yes	0.0	4.0
3	Still pursuing GAHT	Still transgender	519	green	Yes	94.0	940.0
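Rounding each category independently, as the cell above does, is not guaranteed to give exactly 100 (or 1000) squares in general, although it happens to here. One alternative is the largest remainder method, sketched below; this is a swapped-in technique rather than what the notebook uses, and `largest_remainder` is a helper name of my own.

```python
from math import floor

def largest_remainder(counts, cells):
    """Allocate `cells` whole squares proportionally to `counts`."""
    total = sum(counts)
    quotas = [cells * c / total for c in counts]
    alloc = [floor(q) for q in quotas]
    leftover = cells - sum(alloc)
    # hand the remaining squares to the categories with the largest fractional parts
    order = sorted(range(len(counts)), key=lambda i: quotas[i] - alloc[i], reverse=True)
    for i in order[:leftover]:
        alloc[i] += 1
    return alloc

counts = [4, 27, 2, 519]  # the N column of the waffle table
cells_100 = largest_remainder(counts, 100)
cells_1000 = largest_remainder(counts, 1000)
print(cells_100, cells_1000)
```

For these counts the allocation agrees with the n_100 and n_1000 columns above, so the csv does not need to change.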

## post GAHT referrals:

df = pd.DataFrame(
    {
        'Total_referrals_closed':[548],
       'Total_referrals_w_GAHT':[196],  
       # NOTE: 29 is everyone who reidentified (27 at early assessment + 2 after starting treatment)
       'Early_assessment_reid_w_AGAB':[29],
       'Post_GAHT_Reidentification_w_AGAB':[2], 
    }, 
    index=[0]
).T.reset_index().rename({'index':'Group', 0:'N'}, axis=1)
df.to_csv('data/aus_reid_study/jama_aus_reid_simple.csv')
df

	Group	N
0	Total_referrals_closed	548
1	Total_referrals_w_GAHT	196
2	Early_assessment_reid_w_AGAB	29
3	Post_GAHT_Reidentification_w_AGAB	2


df = pd.DataFrame(
    {
       'Total_referrals':[653, 342],  
       'Referral_active':[122, 50], 
       'Referral_waitlisted':[167, 59], 
       'Referral_Admin_Closure':[25,20],
       'Referral_Closed': [339, 213],
        'Referral_Closure_Reason_Not_Determined':[2,2],
        'Referral_Closure_Reidentification_w_AGAB':[20,9],
        'Referral_Closure_Other':[317,202]
    }, 
    index=['AFAB', 'AMAB']
).T.reset_index().melt(id_vars=['index'], var_name='AGAB').rename({'index':'Group'}, axis=1 )
df.to_csv('data/aus_reid_study/research_paper_outcomes_table_by_AGAB.csv')
df

	Group	AGAB	value
0	Total_referrals	AFAB	653
1	Referral_active	AFAB	122
2	Referral_waitlisted	AFAB	167
3	Referral_Admin_Closure	AFAB	25
4	Referral_Closed	AFAB	339
5	Referral_Closure_Reason_Not_Determined	AFAB	2
6	Referral_Closure_Reidentification_w_AGAB	AFAB	20
7	Referral_Closure_Other	AFAB	317
8	Total_referrals	AMAB	342
9	Referral_active	AMAB	50
10	Referral_waitlisted	AMAB	59
11	Referral_Admin_Closure	AMAB	20
12	Referral_Closed	AMAB	213
13	Referral_Closure_Reason_Not_Determined	AMAB	2
14	Referral_Closure_Reidentification_w_AGAB	AMAB	9
15	Referral_Closure_Other	AMAB	202
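A quick consistency check on the figures typed in above. It assumes (my reading, not stated in the notebook) that the four referral statuses partition the total referrals and that the three closure reasons partition the closed referrals; `wide` re-types the same numbers.

```python
import pandas as pd

wide = pd.DataFrame(
    {
        'Total_referrals': [653, 342],
        'Referral_active': [122, 50],
        'Referral_waitlisted': [167, 59],
        'Referral_Admin_Closure': [25, 20],
        'Referral_Closed': [339, 213],
        'Referral_Closure_Reason_Not_Determined': [2, 2],
        'Referral_Closure_Reidentification_w_AGAB': [20, 9],
        'Referral_Closure_Other': [317, 202],
    },
    index=['AFAB', 'AMAB'],
)

status_cols = ['Referral_active', 'Referral_waitlisted',
               'Referral_Admin_Closure', 'Referral_Closed']
reason_cols = ['Referral_Closure_Reason_Not_Determined',
               'Referral_Closure_Reidentification_w_AGAB',
               'Referral_Closure_Other']

# each check is True only if the parts add up to the whole for both AGAB groups
status_ok = bool((wide[status_cols].sum(axis=1) == wide['Total_referrals']).all())
reasons_ok = bool((wide[reason_cols].sum(axis=1) == wide['Referral_Closed']).all())
print(status_ok, reasons_ok)
```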

df_pivot = df.pivot(columns='Group', index='AGAB', values='value').reset_index()
df_pivot = df_pivot[['AGAB', 'Referral_Closed', 'Referral_Closure_Other',   
                     'Referral_Closure_Reason_Not_Determined',  
                     'Referral_Closure_Reidentification_w_AGAB']]
df_pivot.loc['Total'] = df_pivot.sum(numeric_only=True)
df_pivot.fillna('AFAB and AMAB', inplace=True)
for c in df_pivot.columns[2:]:
    df_pivot[f'{c}_pct'] = round(100*df_pivot[c]/df_pivot['Referral_Closed'], 2)

df_pivot.to_csv('data/aus_reid_study/df_pct_pivot.csv')
df_pivot

Group	AGAB	Referral_Closed	Referral_Closure_Other	Referral_Closure_Reason_Not_Determined	Referral_Closure_Reidentification_w_AGAB	Referral_Closure_Other_pct	Referral_Closure_Reason_Not_Determined_pct	Referral_Closure_Reidentification_w_AGAB_pct
0	AFAB	339.0	317.0	2.0	20.0	93.51	0.59	5.90
1	AMAB	213.0	202.0	2.0	9.0	94.84	0.94	4.23
Total	AFAB and AMAB	552.0	519.0	4.0	29.0	94.02	0.72	5.25
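The percentages above use all 552 closed referrals as the denominator, whereas the abstract reports 5.3% out of the 548 closures with a known reason. A small sketch of the same calculation on the known-reason denominator; the `Known_reason` and `Reid_pct_of_known` column names are my own.

```python
import pandas as pd

closures = pd.DataFrame(
    {
        'Referral_Closed': [339, 213],
        'Referral_Closure_Reason_Not_Determined': [2, 2],
        'Referral_Closure_Reidentification_w_AGAB': [20, 9],
    },
    index=['AFAB', 'AMAB'],
)
closures.loc['Total'] = closures.sum()

# exclude closures whose reason could not be determined
closures['Known_reason'] = (closures['Referral_Closed']
                            - closures['Referral_Closure_Reason_Not_Determined'])
closures['Reid_pct_of_known'] = (
    100 * closures['Referral_Closure_Reidentification_w_AGAB'] / closures['Known_reason']
).round(2)
print(closures[['Known_reason', 'Reid_pct_of_known']])
```

The Total row gives 29 of 548, which rounds to the 5.3% quoted in the paper.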

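# sum only the numeric column; recent pandas versions would otherwise also concatenate the AGAB strings
df_hmap = df.groupby('Group')['value'].sum().reset_index()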

df_hmap

	Group	value
0	Referral_Admin_Closure	45
1	Referral_Closed	552
2	Referral_Closure_Other	519
3	Referral_Closure_Reason_Not_Determined	4
4	Referral_Closure_Reidentification_w_AGAB	29
5	Referral_active	172
6	Referral_waitlisted	226
7	Total_referrals	995

#make heatmap df

df_hmap = df.loc[df.Group.isin([
                                'Referral_Closed', 
                                'Referral_Closure_Reason_Not_Determined',
                                'Referral_Closure_Reidentification_w_AGAB',
                                'Referral_Closure_Other'])]
display(df_hmap)

# select 'value' explicitly so the AGAB strings are not aggregated
df_hmap_agg = df_hmap.groupby('Group')[['value']].sum()
display(df_hmap_agg)

	Group	AGAB	value
4	Referral_Closed	AFAB	339
5	Referral_Closure_Reason_Not_Determined	AFAB	2
6	Referral_Closure_Reidentification_w_AGAB	AFAB	20
7	Referral_Closure_Other	AFAB	317
12	Referral_Closed	AMAB	213
13	Referral_Closure_Reason_Not_Determined	AMAB	2
14	Referral_Closure_Reidentification_w_AGAB	AMAB	9
15	Referral_Closure_Other	AMAB	202

	value
Group
Referral_Closed	552
Referral_Closure_Other	519
Referral_Closure_Reason_Not_Determined	4
Referral_Closure_Reidentification_w_AGAB	29

2. US/CAN Levels of Satisfaction Study: Trans Youth Survey Regret Rates

From the abstract of Levels of Satisfaction and Regret With Gender-Affirming Medical Care in Adolescence:

Importance There is a need to improve the evidence base for gender-affirming medical care provided to adolescents, including the experiences of those who have received this care.

Objective To examine rates of satisfaction, regret, and continuity of care in adolescents who received puberty blockers and/or gender-affirming hormones as part of gender-affirming medical care.

Design, Setting, and Participants This survey study used the 2023 online survey wave of an ongoing longitudinal study, the Trans Youth Project, among a community-based sample of transgender youth and their parents initially recruited throughout the US and Canada between 2013 and 2017. The satisfaction and regret data include responses from a youth or their parent representing 87% of the youth aged 12 years or older in the cohort who have received gender-affirming medical care (235 of 269 youths). Of these, 220 completed the 2023 survey (main sample); information about continuity of care was available for all youth. Data analysis was performed from April to August 2024.

Exposure Satisfaction, regret, and continuity of care following puberty blockers or suppression and/or gender-affirming hormones.

Main Outcomes and Measures Self- or parent-reported satisfaction or regret with gender-affirming care and continuation of care. [ … ]

Conclusions and Relevance The findings suggest that youth accessing puberty blockers and hormones as part of gender-affirming care tend to be satisfied with and not regretful of that care several years later. While regret was rare, these experiences need to be better understood.

2.1 Paper Extract relevant to the data (emphasis mine)

Results: Among the 220 youths in the main sample (mean [SD] age, 16.07 [2.40] years; 30 [14%] multiracial, non-Hispanic; 18 [8%] White, Hispanic; 155 [70%] White, non-Hispanic; 17 [8%] other race and ethnicity, including Asian, Black [Hispanic and non-Hispanic], Hispanic with unknown race, multiracial Hispanic, or Native American; gender at last interaction: 68 [31%] boys, 132 [60%] girls, 20 [9%] gender diverse, eg, nonbinary) and their parents, very high levels of satisfaction and low levels of regret with puberty blockers and gender-affirming hormones as well as high levels of continuation of care were reported. Of these 220 respondents in the main sample, 9 were regretful of having received blockers (n = 8) and/or hormones (n = 3; 2 of these individuals reported regret with both), of whom 4 have stopped all gender-affirming medical care and 1 has continued to receive blockers but plans to stop. The 4 others have continued care, suggesting that regret is not synonymous with stopping care.
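The regret counts in the extract overlap (2 youths regretted both blockers and hormones), so the total of 9 comes from inclusion-exclusion rather than a simple sum. A minimal worked check, with variable names of my own:

```python
# Counts taken from the Results paragraph quoted above
regret_blockers = 8
regret_hormones = 3
regret_both = 2

# inclusion-exclusion: don't count the overlap twice
regretful = regret_blockers + regret_hormones - regret_both

# what the regretful youths did next, per the extract
stopped_all_care = 4
continuing_but_plans_to_stop = 1
continued_care = 4

print(regretful, stopped_all_care + continuing_but_plans_to_stop + continued_care)
```

Both figures come to 9, so the three outcome groups account for every regretful respondent.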

2.2 Create and save df as csv

df = pd.DataFrame(
    {
        'Total youth':[269],
        'Total youth who have received GAC':[235],
       'Total respondents':[220],  
       'Boys':[68], 
       'Girls':[132], 
       'Gender diverse':[20],
       'Total regretful': [9],
        'Regretful and discontinued':[4],
        'Regretful and continued treatment':[4],
    }, 
    index=[0]).T.reset_index()
df.columns = ['Group', 'Nr']
df.to_csv('data/trans_youth_project_study/regret_agg_results_by_gender_TG_Youth_survey.csv')
df

	Group	Nr
0	Total youth	269
1	Total youth who have received GAC	235
2	Total respondents	220
3	Boys	68
4	Girls	132
5	Gender diverse	20
6	Total regretful	9
7	Regretful and discontinued	4
8	Regretful and continued treatment	4

## Although this is not a fully worked-out idea, one argument against studies showing positive GAC outcomes is that
# satisfied youth are more likely to respond, whereas those who regret treatment are more likely to disengage.
# If we assume for a moment that ALL non-respondents are regretful, the figures still paint a very positive picture.

df.loc[len(df)] = ["Youth who received GAC and didn't respond", 235-220]
df.loc[len(df)] = ["Youth ASSUMED to have regrets", 235-220+9]
# not regretful = all 235 youth minus those assumed regretful (equivalently 220 - 9)
df.loc[len(df)] = ['Youth NOT regretful', 235 - (235-220+9)]
df

	Group	Nr
0	Total youth	269
1	Total youth who have received GAC	235
2	Total respondents	220
3	Boys	68
4	Girls	132
5	Gender diverse	20
6	Total regretful	9
7	Regretful and discontinued	4
8	Regretful and continued treatment	4
9	Youth who received GAC and didn't respond	15
10	Youth ASSUMED to have regrets	24
11	Youth NOT regretful	211

## under that worst-case assumption, the regret rate would be
print(f'Youth Regret rate for GAC: {round(100*24/235, 2)}%')

## that's a rate of about 1 in 10, a deliberately extreme estimate that is well above the rates given by other studies
## in reality, non-responses cannot be automatically counted as regret; among those who actually responded, the regret rate is
print(f'Actual study youth regret rate: {round(100*9/220, 2)}%')

## add to that the fact that not all those who expressed regret decided to discontinue GAC; for those who did, the rate is
print(f'Rate of Youth stopping GAC due to regret: {round(100*4/220, 2)}%')

# If this study is representative (which it may not be), then we'd see regret occur in roughly 4 in 100 (1 in 25) cases,
# and only about 1 in 50 cases would require cessation of GAC

Youth Regret rate for GAC: 10.21%
Actual study youth regret rate: 4.09%
Rate of Youth stopping GAC due to regret: 1.82%
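The observed and worst-case rates above can be wrapped in one small helper so the assumption is explicit and the numbers are not hard-coded. This is only a sketch: `regret_rate_bounds` is a name of my own, and treating 235 as the full cohort is the notebook's reading of the abstract, not something the function verifies.

```python
def regret_rate_bounds(regretful, respondents, cohort):
    """Return (observed, worst_case) regret rates in percent.

    observed   - regretful respondents as a share of respondents
    worst_case - assumes every non-respondent is also regretful
    """
    if not 0 <= regretful <= respondents <= cohort:
        raise ValueError('expected 0 <= regretful <= respondents <= cohort')
    non_respondents = cohort - respondents
    observed = 100 * regretful / respondents
    worst_case = 100 * (regretful + non_respondents) / cohort
    return round(observed, 2), round(worst_case, 2)

# figures used in the cells above: 9 regretful, 220 respondents, 235 youth
observed, worst_case = regret_rate_bounds(9, 220, 235)
print(f'Observed: {observed}%  Worst case: {worst_case}%')
```

With the notebook's figures this reproduces the 4.09% and 10.21% printed above.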