OSS Questionnaire
1. Background of the Report
Since the release of the "2015 China Open Source Community Participation Survey Report" at the beginning of 2016, kaiyuanshe has continuously published annual open source developer survey reports, aiming to present the current status and trends of China's open source development from multiple dimensions. In 2024, we continue this tradition, and with the help of data analysis methods and survey tools, further map out the landscape of China's open source world, assisting the open source community, developers, and industry professionals in understanding the changes in the domestic open source ecosystem.
This survey will continue to focus on the participation of various levels within the open source community, aiming to gain an in-depth understanding of the respondents' personal information, work status, participation in open source communities, and developers' technical backgrounds through multidimensional questions. The survey is designed with multiple role levels based on the depth of participation in the open source community, including users, participants, contributors, maintainers, and ecosystem operators, to comprehensively reflect the participation and influence of users at different levels within the open source community. The specific definitions are as follows:
- User: A user who has used one or more open source products.
- Participant: A user who interacts with the open source community (for example, engaging in communication with the open source community, participating in community-organized activities, etc.).
- Contributor: A user who has made substantial contributions to the open source community (including code and non-code contributions).
- Maintainer: A user mainly responsible for the daily operation of the open source community (including project maintainers, PMC members, etc.).
Additionally, ecosystem operators are users mainly responsible for the daily operation of the open source community, positioned above participants, and together with maintainers, are collectively referred to as operators.
Similar to previous years, this survey, in addition to collecting basic information, also includes targeted questions for different role groups to gain a deeper understanding of the participation motivations, contribution models, and influence of users at various levels.
Basic Information of this survey is as follows:
- Target Audience: Developers, community members, contributors, students, government and corporate managers
- Survey Content: Mainly covers personal information, work status, open source community participation, and developers' technical background
- Survey Method: Sample and data collection via online questionnaire, data analyzed using cross-comparison
- Distribution Channels: Online promotion via official accounts, and offline distribution through Open Source Society, OSCAR China Industry Conference, PyCon, 2024 9th China Open Source Annual Conference, and other multi-channel platforms.
- Question Types: Single choice, multiple choice, open-ended
- Number of Questions: 41
- Sample Size: 631
2. Preview of Questionnaire Results
Respondent Characteristics
- Age and Gender: The respondents' ages range from under 21 to over 50. Male respondents make up the majority (71.0%), while female respondents account for 28.4%.
- Educational Background: Respondents' educational levels range from undergraduate to master's and doctoral degrees, indicating a generally high level of education.
- Occupational Identity: The respondents' professional identities are diverse, including students, developers, technical managers, architects, data engineers, and analysts, covering various fields in the IT industry.
- Geographical Location: Respondents are from multiple provinces and cities across the country, such as Beijing, Shanghai, Guangdong, and Zhejiang, providing good geographical representation.
Open Source Participation
- Open Source Exposure Duration: Respondents' exposure to open source ranges from less than 1 year to more than 10 years, showing a mix of new and long-time members in the open source community.
- Reasons for Using Open Source Software: The main reasons respondents use open source software include its free nature, ability for secondary development, strong community atmosphere, and maintainability.
- Methods of Searching for Open Source: Most respondents search for open source products through code hosting platforms, search engines, technical communities, and technical documentation.
Open Source Contribution
- Contribution Platforms: GitHub is the most commonly used platform for open source project contributions, followed by domestic platforms like Gitee.
- Contribution Methods: Respondents mainly contribute to open source projects through code contributions, documentation-related contributions, and open source advocacy.
- Incentives: Honor-based incentives, social incentives, and career development incentives are important factors that influence respondents' open source contributions.
Community Operations Survey
- Community Roles: Respondents play various roles in the open source community, including users, participants, contributors, and maintainers.
- Communication Methods: International communication tools, domestic communication tools, and asynchronous communication tools are the primary means of communication between respondents and the open source community.
- Community Activity Level: The number of active users and developers in the communities respondents belong to varies greatly, ranging from fewer than 50 to over 500.
Domestic Open Source Development Survey
- Corporate Use of Open Source: Most companies use the community edition of open source software and have established corresponding usage requirements and management regulations.
- Open Source Education in Universities: Many universities have set up courses related to open source and support infrastructure and resource development for related projects.
- Open Source Practice Activities: Respondents actively participate in various open source practice activities, such as Google Summer of Code (GSoC) and Open Source Summer (OSPP).
- Commercialization of Open Source Projects: Most respondents agree with the idea of commercializing open source projects, showing a trend of integrating open source with business.
Key Open Source Terms in 2024
Based on the 2024 open source keyword cloud, we can summarize the key themes respondents are most concerned about in the new year:
- Technological Innovation: Keywords such as "innovation," "intelligence," and "large models" indicate that respondents are highly focused on the latest advancements in technology, especially artificial intelligence and large model development.
- Open Source Ecosystem: Words like "open source," "sharing," and "collaboration" highlight the important role the open source community plays in driving technological development and knowledge sharing. Respondents look forward to technological breakthroughs and innovations through open source projects and community collaboration.
- Security and Privacy: In the digital age, as technology applications deepen, the keyword "security" reminds us that data security and privacy protection are critical issues that cannot be ignored.
- Commercialization and Application: Keywords such as "commercialization" and "application" show respondents' focus on how to turn open source technology into practical applications and commercial value.
- Education and Talent Development: With the development of technology, keywords like "education" and "learning" suggest that continuous learning and talent development are key to adapting to future technological changes.
- Community and Cooperation: Keywords like "community" and "cooperation" emphasize the importance of building vibrant open source communities and fostering cross-field collaboration, which are essential for driving technological progress and project success.
- Change and Adaptation: In the face of an ever-changing technological environment, keywords like "change" and "development" indicate that respondents recognize adapting to and leading change are key to personal and organizational growth.
Open source keywords in 2024 |
---|
![]() |
3. Questionnaire Analysis
3.1 Characteristics of the Respondents
By analyzing respondents' age, gender, education, city, industry, and career identity, we can sketch a basic profile of the audience involved in the open source community. This helps us understand how individuals from different backgrounds interact with the open source community and provides a basis for targeted community development strategies.
3.1.1 Distribution of Age, Gender, Education, Region
Age Distribution | Gender Distribution |
---|---|
![]() | ![]() |
Survey data shows that respondents are mainly concentrated in the age group of 21-30, with respondents aged 21-25 accounting for the highest proportion, reaching 26.2%, followed by 26-30, accounting for 20.5%. This shows that the audience of the open source community is dominated by young people, especially adults in the early stages of their careers, who may be more interested in new technologies and open source projects and are more willing to participate and contribute. Overall, the distribution is similar to last year.
In terms of gender distribution, male respondents accounted for the vast majority, reaching 71.0%, while female respondents accounted for 28.4%, and respondents of other genders accounted for only 0.6%. This reflects that male participation is significantly higher than female participation in open source communities and related fields. However, the share of female respondents has risen somewhat from last year's 25.83%.
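To make the gender shares above more concrete, the percentages can be converted into approximate head counts. The sketch below assumes that all 631 respondents answered the gender question, which the report does not state explicitly; the function name `approx_count` is illustrative only.

```python
# Reported sample size of the survey.
SAMPLE_SIZE = 631

# Gender shares in percent, as reported in the text.
gender_share = {"male": 71.0, "female": 28.4, "other": 0.6}


def approx_count(percent, n=SAMPLE_SIZE):
    """Convert a percentage of the sample into a rounded head count.

    Assumption: every respondent answered the question, so the base is n.
    """
    return round(n * percent / 100)


counts = {group: approx_count(share) for group, share in gender_share.items()}
for group, count in counts.items():
    print(f"{group}: about {count} respondents")
```

Under that assumption the three rounded counts add up to the full sample, which is a quick sanity check that the reported shares are mutually consistent.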
Education Level Distribution | Regional Distribution |
---|---|
![]() | ![]() |
Most respondents hold an undergraduate degree or above: undergraduates account for the highest proportion at 53.2%, followed by master's degrees at 29.0% and doctoral degrees and above at 6.3%. In terms of regional distribution, the proportion of respondents in the eastern coastal areas and some central areas is higher, while the proportion in the western and northern areas is relatively low. Beijing, Guangdong, and Shanghai have a large number of respondents, partly because we collected questionnaires offline in these places.
3.1.2 Distribution of Industry and Occupation
Industry Distribution | Occupation Distribution |
---|---|
![]() | ![]() |
Respondents mainly work in the Internet/IT/electronics/communications industry, which accounts for 72.23%, indicating that the sample is concentrated in the technology field. In terms of professional identity, students make up the largest group at 36.3%, followed by back-end developers, academic researchers, and open source/technical evangelists/DevRel. Overall, the respondents are mostly technical practitioners and students, covering multiple professional identities in the computer industry.
3.2 Open Source Participation Status
This section summarizes the frequency, motivation, form, and barriers to respondents’ participation in open source projects, revealing the activity and engagement of their interactions with the open source community and the factors that influence their participation.
3.2.1 Level of Participation in Open Source Communities
The role of the open source community participants | The duration of exposure to open source |
---|---|
![]() | ![]() |
The survey shows that most members of the open source community are users (72.1%), more than half are participants (55.1%), and a smaller share are contributors (29.5%). Compared with last year's 26.51%, the proportion of contributors, that is, those who have made substantial contributions to the open source community, has increased this year.
In terms of the duration of exposure to open source, 22.2% of respondents have had less than 1 year of exposure, while more than half of all respondents have more than 3 years of experience.
We cross-analyzed respondents' roles in the open source community against their answers to the question "How much do you think you are part of the open source community?"
To what extent do you think you are part of the open source community |
---|
![]() |
It can be seen that maintainers, contributors, and ecosystem operators have a stronger sense of belonging to the open source community than participants and users.
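A cross-analysis of this kind can be sketched as a simple cross-tabulation of role against the belonging answer. The example below is a minimal illustration, not the report's actual pipeline: the response rows are hypothetical, and the 1 to 5 belonging scale is an assumption, since the report does not describe the answer scale.

```python
from collections import Counter, defaultdict

# Hypothetical (role, belonging score) pairs. These are NOT survey data,
# and the 1-5 scale is an assumption made for illustration.
responses = [
    ("user", 2), ("user", 3),
    ("participant", 3),
    ("contributor", 4),
    ("maintainer", 5), ("maintainer", 4),
]


def cross_tab(rows):
    """Count how often each belonging score occurs within each role."""
    table = defaultdict(Counter)
    for role, score in rows:
        table[role][score] += 1
    return table


def mean_score(score_counts):
    """Weighted mean of the scores in one row of the cross-tabulation."""
    total = sum(score_counts.values())
    return sum(score * n for score, n in score_counts.items()) / total


table = cross_tab(responses)
means = {role: mean_score(score_counts) for role, score_counts in table.items()}
for role, value in means.items():
    print(f"{role}: mean belonging {value:.1f}")
```

Comparing the per-role means, or the full per-role distributions, is one way to support a statement such as "maintainers report a stronger sense of belonging than users".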
The following questions are for respondents whose roles in the open source community are "user" and above.
3.2.2 Use of Open Source Products
Reasons for choosing an open source product | Factors that affect selection |
---|---|
![]() | ![]() |
The main reason users choose open source software is that it is free, cited by 63.3%, which reflects the importance of cost in open source product selection. The ability to do secondary development accounts for 56.5%, and "good community atmosphere" accounts for 51.7%, indicating that a positive, friendly community environment is crucial to attracting and retaining users.
When choosing an open source product, respondents pay the most attention to code standardization and developer activity. This shows that users care not only about the functionality and quality of open source products, but also about community and developer activity and the sustainability of the project.
Problems encountered with open source products | Factors that drive open source contribution |
---|---|
![]() | ![]() |
More than half of the respondents have encountered missing documentation in a project, followed by unstable version updates.
Factors such as personal interests, community atmosphere and improving technical capabilities have played an important role in promoting open source contribution.
3.2.3 Technology Direction
Technical Directions of Interest | Open Source Licenses that You Know |
---|---|
![]() | ![]() |
Respondents showed strong interest in artificial intelligence, accounting for 73%, followed by development tools and database and data processing.
Regarding open source licenses, Apache is the most well-known open source license, followed by MIT and GPL.
3.2.4 Communication Methods
Ways to search for open source products | Ways to communicate with the community |
---|---|
![]() | ![]() |
When searching open source products, "search through code hosting platform" is the most common way of discovery, accounting for as much as 64.6%. The second is "technical community, technical media recommendation", accounting for 56.0%. The proportion of "search through search engines" is 51.0%, while "technical communication and open source code" accounts for 41.1%.
Respondents communicate with the open source community mainly through domestic communication tools (such as DingTalk, WeChat, QQ, Feishu, etc.) and asynchronous communication tools (such as GitHub Issue, Discussion, Mail List, etc.), while international communication tools (such as Slack, Skype, Telegram, Lark, etc.) are also widely used. This shows that international open source communities mostly rely on asynchronous communication tools, which differs significantly from domestic communities.
Commonly used products / technology communities |
---|
![]() |
The vast majority of respondents participate mainly through code hosting platforms and open source communities. In addition, nearly half of the respondents also take part in the open source community through domestic technology forums.
3.3 Open Source Contribution Status
The questions in this section are for respondents whose roles in the open source community are "contributor" and above. By analyzing the types and quality of respondents' contributions to open source projects, we can evaluate their specific contributions to the community and identify potential ways to improve the efficiency and quality of contributions.
3.3.1 Participation of Universities in Open Source
Participation in open source practice activities | Time spent on open source each week |
---|---|
![]() | ![]() |
More than one-third of student developers take part in well-known open source programs such as Google Summer of Code (GSoC) and Open Source Summer (OSPP). GSoC attracted 7.4% of student developers and OSPP attracted 28.7%, for a combined 36.1%.
More than half of the contributors invest more than 5 hours a week on open source projects. More than 20% of contributors invest more than 10 hours a week in open source projects.
Open source education and support in colleges and universities |
---|
![]() |
21.4% of the students surveyed attend colleges and universities that offer courses related to open source, and 16.7% attend colleges and universities that offer open source projects. In addition, 13.7% attend colleges and universities that support the infrastructure and resources of open source projects (such as servers, code hosting platforms, etc.).
3.3.2 Open Source Contribution Approach
Main open source contribution platform | Common development languages for open source contribution |
---|---|
![]() | ![]() |
GitHub remains the dominant platform among respondents, followed by Gitee and GitLab. This shows that GitHub still has great influence among domestic developers, while domestic platforms are gradually emerging. The main development languages used include Python, C/C++, Java, JavaScript, and Go. Assembly language, TypeScript, and others were also selected relatively often.
3.3.3 Open Source Contribution Types
Main contribution types | Types of projects contributed |
---|---|
![]() | ![]() |
Respondents contribute to open source projects in diverse ways. "Code contribution" is the main one, accounting for 30.5%. The second is "documentation-related contributions", accounting for 24.6%, which shows that writing and maintaining documentation is also an indispensable part of open source projects. Next are "open source advocacy" at 13.8%, "open source community operation" at 12.8%, "assisting with community activities" at 10.0%, and "open source-based commercialization projects" at 8.3%.
Respondents also contribute to a variety of open source projects, mainly in technical foundations and infrastructure.
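The six contribution-type percentages above sum to 100.0%, which suggests they are shares of all selections rather than shares of respondents. Other multiple-choice results in this report, such as the reasons for choosing open source products, sum to well over 100%. The sketch below shows both ways of normalizing a multiple-choice question. The answer data and option names are hypothetical and are not taken from the survey.

```python
from collections import Counter

# Hypothetical multiple-choice answers, one list per respondent.
# These are NOT survey data.
answers = [
    ["code", "docs"],
    ["code"],
    ["docs", "advocacy"],
    ["code", "advocacy"],
]


def share_of_respondents(rows):
    """Percent of respondents who picked each option (can sum to more than 100)."""
    counts = Counter(option for row in rows for option in set(row))
    return {option: 100 * c / len(rows) for option, c in counts.items()}


def share_of_selections(rows):
    """Percent of all selections that went to each option (sums to 100)."""
    counts = Counter(option for row in rows for option in row)
    total = sum(counts.values())
    return {option: 100 * c / total for option, c in counts.items()}


by_respondent = share_of_respondents(answers)
by_selection = share_of_selections(answers)
print(by_respondent)
print(by_selection)
```

The two normalizations answer different questions, so figures computed one way should not be compared directly with figures computed the other way.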
3.3.4 Incentive Mechanism
Incentive Methods | Source of Financial Returns |
---|---|
![]() | ![]() |
The incentive mechanism of open source communities is multi-dimensional, including not only financial returns, but also career development, community recognition and personal growth. All incentive methods have been positively evaluated, indicating that diversified incentive mechanisms have had a positive impact on developers' open source participation. At the same time, although the financial returns of open source projects are diverse, most developers value non-financial incentives more.
In terms of financial returns, most developers participate in open source projects mainly for non-financial reasons, and nearly 40% of respondents have not received any direct financial return from open source projects.
3.4 Community Operations Survey
The questions in this section are asked to respondents whose roles are "operators" in the open source community. This section will explore the respondents' views on open source community operations, including community management, event organization, communication mechanism, etc., to understand the effectiveness and improvement space of community operations, and provide a reference for improving community operations efficiency and member satisfaction.
3.4.1 Overview of Open Source Communities
Community User Count | Active Developers |
---|---|
![]() | ![]() |
The open source communities surveyed are primarily small to medium-sized: about half of the operators' communities have fewer than 200 users, with 21.8% having fewer than 50 users and 28.6% having 50-200 users. More than 30% of the communities have over 500 users.
3.4.2 Open Source Community Management
Community Management Status | Community Support from Commercial Companies |
---|---|
![]() | ![]() |
In terms of community management, about half of the communities have clear governance structures and dedicated personnel for daily operations, with these two factors accounting for 13.3% and 12.7%, respectively. At the same time, communities generally emphasize the development of rules and guidelines, as well as the continuous updating of documentation and resources to help new members integrate, with both factors having a proportion of 11.0%.
Regarding support from commercial companies, most open source communities receive active participation and support from commercial companies. Specifically, 10.2% of communities have commercial companies involved in co-development, 8.7% have received formal adoption by commercial companies, and 8.4% have received resources or financial sponsorship from commercial companies. However, 4.8% of communities report no support from commercial companies, and the "Other" category accounts for only 0.2%.
3.4.3 Open Source Software Commercialization Survey
Corporate Use of Open Source Software | Support for Commercializing Open Source Projects |
---|---|
![]() | ![]() |
The vast majority of enterprises tend to use open source software, with the highest proportion (43.3%) opting for the community edition. When it comes to the use of open source software, the ratio of enterprises with clear usage requirements and management regulations to those without such regulations is approximately 1:1.27. This suggests that while some enterprises pay attention to regulations and management when using open source software, a significant number still lack proper management protocols, which may be influenced by factors such as company size, industry characteristics, and their attitude towards open source software management.
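The 1:1.27 ratio above can be easier to read as two percentage shares. The sketch below does that conversion, assuming the ratio covers only enterprises that fall into one of the two groups (with or without clear regulations); the function name `ratio_to_shares` is illustrative.

```python
def ratio_to_shares(a, b):
    """Convert a ratio a:b into two percentage shares that sum to 100."""
    total = a + b
    return 100 * a / total, 100 * b / total


# Ratio of enterprises with clear regulations to those without, as reported.
with_rules, without_rules = ratio_to_shares(1, 1.27)
print(f"with regulations: {with_rules:.1f}%")
print(f"without regulations: {without_rules:.1f}%")
```

Read this way, roughly 44% of those enterprises have clear usage requirements and management regulations and roughly 56% do not.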
Regarding the recognition of the commercialization of open source projects, the average rating is 3.5, indicating that respondents generally hold a moderately positive attitude. Specifically, the highest proportion of responses (28.9%) gave a rating of 3, followed by 27.8% who gave a rating of 5. This further indicates that, although there are some differences of opinion, most respondents are supportive of the commercialization of open source projects.
3.5 Open Source Development Survey
This section summarizes respondents' views and suggestions on the future development of open source communities, including technological trends, community development directions, and potential collaboration opportunities. The aim is to provide insights for the long-term development and strategic planning of open source communities.
3.5.1 Open Source Development Trends
Characteristics of Open Source Project Sustainability | Criteria for Evaluating Open Source Projects |
---|---|
![]() | ![]() |
Overall, respondents generally agree that the key factors for the healthy and sustainable development of open source communities include rapid community response time, continuous influx of new contributors, and the ability to effectively transform new contributors into long-term contributors. These factors account for 52.5%, 41.1%, and 31.1%, respectively.
When evaluating open source projects, respondents mainly focus on the project's influence and popularity, the activity level of the project and its community, the authority of the developers, and whether the project is regularly updated and maintained. These factors account for 61.1%, 49.4%, 41.3%, and 37.9%, respectively. These concerns reflect developers' comprehensive consideration of a project's technical strength, community engagement, and long-term maintainability.
Among the sustainability factors, a good community culture and atmosphere is also crucial to community success, accounting for 28.7%, while financial support, widespread use of the project, and technological advancement of the project account for 16.8%, 16.3%, and 9.0%, respectively. Although these factors were chosen less often, they remain important components of community development.
3.5.2 The Impact and Challenges of Artificial Intelligence on Developers and the Open Source Ecosystem
Types of Large Model Products Used |
---|
![]() |
Overall, closed-source models dominate the large model field due to their powerful performance and widespread application, while open-source models show their unique value and potential in specific domains and application scenarios. The GPT series leads by a wide margin with a usage rate of 58.3%. LLaMA (Meta), whose weights are openly released rather than closed-source, follows at 34.9%. GPT series open-source implementations and Baidu's Qianfan model account for 14.6% and 13.7%, respectively, showing recognition of the open-source community's efforts in the large model field. Other models such as iFlytek Spark and OpenLLaMA have usage rates of 13.3% and 11.9%, and Zhipu AI and the ChatGLM series are also popular among specific user groups, with usage rates of 11.6% and 8.9%, respectively.
Impact of Artificial Intelligence on Open Source Projects/Communities | Technical Challenges in the Development of Open Source Large Models |
---|---|
![]() | ![]() |
Artificial intelligence technology has had a profound impact on open source projects and communities. The most significant impact is the promotion of interdisciplinary collaboration, expanding open source projects in emerging fields, which accounts for 30.8%. Secondly, AI has accelerated the learning and innovation speed of developers, accounting for 20.2%. Additionally, AI has improved the efficiency of code generation and review (14.3%), automated common development tasks to reduce repetitive work (13.0%), and helped community members answer technical questions and provide guidance (6.8%). However, 4.6% of respondents worry that AI may lead to the creation of more low-quality or repetitive projects, and 4.4% are concerned that it may increase dependence on AI models, reducing developers' ability to code independently.
In the development of open-source large models, there are many technical challenges that need to be addressed. The most pressing issue is reducing the cost of model training and usage, which accounts for 53.8%, highlighting the economic barriers faced in large-scale deployment and use of AI models. Improving model transparency and interpretability is also a significant challenge, with a proportion of 39.5%, as it relates to the model's credibility and user understanding of AI decisions. Furthermore, improving the controllability and security of large models in real-world applications (34.9%) and eliminating data biases and ethical issues within the models (28.7%) are also critical. Providing more reusable open-source models and toolkits (23.2%) and enhancing the accessibility and sharing mechanisms of large models within the open-source community (14.3%) are also key factors for driving community development. Addressing these challenges will contribute to the healthy development and widespread application of open-source large models.