CHIRON scholars Astha Kapoor & Sam Moore explore the contours of what constitutes community in a biorepository context, how we might ensure representativeness, and what governance limits, if any, should be considered given the indefinite lifespan of these resources
The CHIRON toolkit takes a pragmatic approach to affirming community interests in contexts where biorepository data has already been collected and governance parameters around its use are already established. When we envision a more comprehensive approach to the biorepositories of tomorrow, we can further open the door to community interests and roles. In this week’s post, CHIRON academic workgroup members Astha Kapoor and Sam Moore explore the contours of what constitutes community in a biorepository context, how we might ensure representativeness, and what governance limits, if any, should be considered given the indefinite lifespan of these resources.
Biorepositories, vital for medical research, collect and store human biological samples and associated data for future use. However, relying solely on the individual consent of data contributors for biorepository data governance is becoming inadequate. Big data analysis concentrates on large-scale behaviors and patterns, shifting attention from singular data points to identifying data “journeys” relevant to a collective. The individual becomes a small part of the analysis, with the harms and benefits emanating from the data occurring at an aggregated level.
Community refers to a particular qualitative aspect of a group of people that is not well captured by quantitative measures in biorepositories. This is not an excuse to dodge the question of how to account for communities in a biorepository context; rather, it shows that a framework is needed for defining different types of community that may be approached from a biorepository perspective.
Engaging with communities in biorepository governance presents several challenges. Moving away from a purely individualized understanding of governance towards a more collectivizing approach necessitates an appreciation of the messiness of group identity, its ephemerality, and the conflicts entailed therein. So while community implies a certain degree of homogeneity (i.e., that all members of a community share something in common), it is important to understand that people can simultaneously consider themselves a member of a community while disagreeing with many of its members, the values the community holds, or the positions for which it advocates. The complex nature of community participation therefore requires proper treatment for it to be useful in a biorepository governance context.
In a forthcoming white paper, we propose the following framework for defining different types of communities within the biorepository context:
Formal Communities: These are formally constituted entities with established governance structures and often legal recognition. Examples salient to the biorepository context include Tribal nations and Indigenous peoples who maintain their own governance structures (up to and including sovereignty), but may also include advocacy organizations with strong internal governance.
Informal Communities: Characterized by less obvious constitution and normative rules, informal communities include co-located people (e.g., a neighborhood or region) or those with shared lived experiences, such as the LGBTQ+ community.
Invisible Communities: Defined within biorepositories by researchers, invisible communities are governed externally, raising concerns about surveillance and consent.
Impacted Communities: Arising in response to specific events or issues within the biorepository context, impacted communities are dynamic and responsive, requiring rapid decision-making and governance procedures that allow for flexibility in a changing situation.
When we talk about communities, we often refer to a formally constituted entity. The formal community is intentionally created and maintained by the members who are willingly playing a part in its activities. Formal communities are characterized by their structure: they may have extensive bylaws governing their work, robust mechanisms for organizing their work, and/or procedures for addressing and resolving conflict. These community structures may extend to legal power, as in the case of Tribal nations.
There is a rich literature on how formal communities operate and an emerging literature on how they may govern their data. For example, the CARE principles for Indigenous Data Governance are designed to allow Indigenous Peoples to “assert greater control over the application and use of Indigenous data and Indigenous Knowledge for collective benefit.”
Similarly, there is work on data trusts and other forms of commons-based data governance that are premised on the idea that formal communities have concrete, self-defined ways of governing themselves and that these processes may be extended to include data governance.
Although the growing literature on data governance in formal communities is a positive development, there is still much work to be done. While it is important to not assume that the formalization of a community solves issues around collective data governance, formalized communities may have their own governance systems and representative democratic processes that allow or guide engagement on questions of data governance, as is especially the case with many Indigenous communities. It is also important to note that legally defined communities, particularly minoritized ones, often exist at the behest of the state and may be subject to laws that are not in their interest. Biorepositories should put in place systems to accommodate the governance procedures of formalized communities where possible.
In contrast to the formal community, the informal community is less obviously constituted: its members are harder to identify with clarity and its rules may be strongly normative and unspoken rather than codified. Many communities of shared identity would fall under the informal community banner, such as the LGBTQ+ community, as would people within shared localities. Informal communities are therefore fraught with misidentification, conflict and varying levels of participation and hierarchy, despite also being important sites of collective identity. There is a fuzziness around the definitions of informal communities that limits their utility in biorepository contexts.
While there is a great deal of literature on how informal communities operate, particularly in the context of the commons, there is little on how such informal groups may govern their data in a formal sense. What form of consent is needed in the absence of any formal procedures for obtaining this consent? How often does this consent need to be ratified? And what are the limits of community consent without formal structures? In many cases, the informal community simply highlights the need for formal structures and definitions, and so the question of consent might instead be what minimum viable amount of formalization is required to ensure informal communities are able to adequately participate in the governance and use of their collective data.
Invisible communities are especially salient to biorepositories, describing “communities” instantiated by researchers without the prior awareness of the people so grouped. In some cases, some sense of community (formal or informal) may already exist among people grouped by researchers. However, critically, for invisible communities there is no obvious connection between the community’s existing formal procedures or normative paths and those imposed on them as a result of their external definition by researchers. It is important to note the connotation of surveillance with the invisible community; members may be unaware of their inclusion or unable to stop it, as for Tibetan, Uyghur, and Hui “participants” recently included in several studies powered by data from a Chinese biorepository.
Yet there are also more legitimate communities that are created in this sense, such as when a genetic marker is discovered that elevates risk of a particular disease, as for the BRCA “previvor” (a portmanteau of predisposition and survivor) community. Certainly informal community existed for generations among families with a very high risk of apparently inherited breast and ovarian cancer prior to the discovery of the BRCA genes in the 1990s. However, the characterization of these genes created an invisible community of previvors who now face collective legal risks.
For invisible communities, the task of good community governance is to unearth these communities and ensure that members are aware of their existence, allowed to withdraw, and have control over their implications. Additionally, invisible communities, like all communities, are not static: they may also move into a more formalized mode of community governance. This has certainly been the case for the BRCA previvor community, which rapidly developed informal (facilitated in part through the rise of social media) and formal community structures, reclaiming legal right over their DNA sequences and spearheading ongoing efforts to reclaim governance rights over their community data hosted on social media. This point is of critical importance to researchers who assume they have the same rights to surveillance of invisible communities even after those communities have incorporated in their own right. In other words, while researcher surveillance may have “unearthed” the existence of a newly salient group of people, collective organizing within the community may require the surveillance practices of researchers to change as the community begins to assert its own wishes.
Researchers, biorepository managers, ethics and oversight committees, and funders should anticipate the creation of invisible communities through biorepository-enabled research and have systems in place to ensure that future good community governance is supported. Ideally this work would happen in advance of biorepository deposition; however, given the longitudinal nature of data use within the biorepository context, processes should also be developed to acknowledge and work with the realities of invisible communities post-deposition. Further, researchers, biorepository managers, ethics and oversight committees, and funders must monitor for and respond to invisible communities’ sense of identity as it evolves over time. Finally, in the very strongest terms, the scientific community as a whole has a moral obligation for vigilance and action against tyrannical use of biorepository data.
The impacted community is one created out of necessity due to a particular event or issue that affects them. Unlike invisible communities, impacted communities are created by the event and are governed internally in response to it. Because they are brought together by virtue of external factors rather than internally shared characteristics, impacted communities can be highly heterogeneous. These kinds of communities need rapid decision-making abilities and the ability to move quickly in response to the issue(s) that affect them. Biorepositories are uniquely positioned to activate such communities.
One example of an impacted community was instantiated as a result of the Spectrum 10K study. This study-cum-biorepository aimed to collect the genomes of 10,000 people with autism living in the UK in order to understand “genetic and environmental factors that contribute to autism and autistic people’s health.” Individual people with autism (or their caregivers) consented and contributed samples. Yet the study was paused after participants collectively expressed “concerns about the study’s lack of transparency around its aims, and worries over conflicts of interest, consent issues and the long-term use of biodata gained from the saliva sample.” Participants grouped together by this study became a community as they were impacted by the biorepository itself, eventually leading to the study’s suspension.
Impacted communities may be a site of conflict, not least because members are brought together in response to something that participants may want dealt with in different ways. Impacted communities are highly changeable and temporal, requiring continuous participation by community members in response to changing events. These kinds of communities point to the need for responsive forms of inclusion within biorepository data governance rather than a generic process that can be indiscriminately rolled out for each community.
It is important to consider the ways in which researchers create and understand different communities as part of their research. While the above framework refers to different ways of understanding communities, researchers need to pay attention to different ways in which the community may evolve or be understood as a singular, static entity. For example, the question of scale, often a primary driver of researchers to biorepositories, is an important one not only from a computational perspective: considering communities’ ability to “consent” to biorepository research requires that the community be adequately defined in terms of scale to have a meaningful voice in questions of data governance. If the scale of the perceived community is too big, it will be inappropriately presented as one whole entity that may, in fact, be better represented as multiple communities. However, if the community is too narrow in scope or selective, it will inappropriately represent the interests of a minority as those of the majority. Identifying the right size and scale of the community will aid representation, but biorepository governance still ideally requires active and continuous consent from community members.
Similarly, for how long is a community definition appropriate and for how long may a community member feel represented by their community? This question is particularly fraught for biorepository data that is designed to exist in perpetuity. The answer is a decision based on pragmatics. Part of the process of formally becoming a community member is exchanging some degree of autonomy for the collective will. What, therefore, is the appropriate amount of time that a member may defer to the collective will? A related question concerns the remit of the community in matters of representation. What is the community allowed to decide on behalf of its members? What issues require a democratic vote, as opposed to mere representation? These questions speak to the need to not homogenize the views or essentialize the identities of community members, instead designing structures around the level of risk of a decision and the importance of an issue to the community. This is what makes a community different from a group selected because of a shared characteristic. Community membership requires active consent and continuous participation for it to be successful.
We believe that researchers must be made aware of, and acknowledge the limits of, individual consent in the context of biorepositories and consider how individual mechanisms of consent interface with ideas of group consent. Researchers should also develop methodologies for community consent that account for representation (both who is represented and how they are able to act on representation) in the framework of communities described above. Researchers should be aware that the creation and use of datasets within biorepositories is often an act of community making itself, and that people get put into groups that share qualities unbeknownst to them. Therefore, it is the responsibility of researchers and biorepositories to make this process visible and ensure that communities have the opportunity to provide input on how their data is used.