Advanced Seminar (Oberseminar), 21 April 2015

12:30 - 13:30

Room U151, Oet. 67


Philip Mayer: An Empirical Analysis of the Utilization of Multiple Programming Languages in Open Source Projects

Background: Anecdotal evidence suggests that software applications are usually implemented using a combination of (programming) languages.

Aim: We want to provide empirical evidence on the phenomenon of multi-language programming.

Methods: We mine data from 1150 open source projects, selected for diversity from a public repository, to a) investigate the number and type of languages found in each project and their relative sizes; b) report on associations between the number of languages found and the size, age, number of contributors, and number of commits of a project using a (quasi-)Poisson regression model; and c) discuss concrete associations between the general-purpose languages and domain-specific languages (DSLs) found, using frequent item set mining.

Results: We found a) a mean of 5 languages per project, with a clearly dominant main general-purpose language and 5 often-used DSL types; b) a significant influence of size, number of commits, and main language on the number of languages, but no significant influence of age or number of contributors; and c) three language ecosystems grouped around XML, Shell/Make, and HTML/CSS.

Conclusions: Multi-language programming seems to be common in open source projects and is a factor that must be addressed in tooling and when assessing the development and maintenance of such software systems.