Software Ecosystems as
Networks: the FASTEN project
Paolo Boldi
Università degli Studi di
Milano
Italy
Programmers and sharing
❖ The fundamental act of
friendship among
programmers is the sharing of
programs
The GNU manifesto (1985)
Sharing through software libraries
❖ One form of sharing is providing libraries
❖ Today, libraries are made available on the Internet
❖ on forges (SourceForge, GitHub, BitBucket, …)
❖ or repositories (Maven, PyPi, CPAN, …)
❖ The Internet made the dream of collaborative development a
reality
Industrial revolution
at the harbour of software development
❖ All trades, arts, and handiworks have gained by
division of labour, namely, when, instead of one
man doing everything, each confines himself to a
certain kind of work distinct from others in the
treatment it requires, so as to be able to perform it
with greater facility and in the greatest
perfection. Where the different kinds of work are
not distinguished and divided, where everyone is
a jack-of-all-trades, there manufactures remain
still in the greatest barbarism.
Immanuel Kant
Groundwork for the Metaphysics
of Morals (1785)
With libraries come dependencies…
<project xmlns="http://maven.apache.org/POM/4.0.0">
  <modelVersion>4.0.0</modelVersion>
  <groupId>foo-bar</groupId>
  <artifactId>NeoImport</artifactId>
  <version>1.0-SNAPSHOT</version>
  <packaging>jar</packaging>
  <dependencies>
    <dependency>
      <groupId>org.neo4j.driver</groupId>
      <artifactId>neo4j-java-driver</artifactId>
      <version>1.0.3</version>
    </dependency>
    <dependency>
      <groupId>org.slf4j</groupId>
      <artifactId>slf4j-api</artifactId>
      <version>1.7.26</version>
    </dependency>
    <dependency>
      <groupId>ch.qos.logback</groupId>
      <artifactId>logback-classic</artifactId>
      <version>1.2.3</version>
    </dependency>
  </dependencies>
</project>
Library
Dependencies
+version
+version constraints
…and transitive dependencies…
<project xmlns="http://maven.apache.org/POM/4.0.0">
  <modelVersion>4.0.0</modelVersion>
  <groupId>foo-bar</groupId>
  <artifactId>NeoImport</artifactId>
  <version>1.0-SNAPSHOT</version>
  <packaging>jar</packaging>
  <dependencies>
    <dependency>
      <groupId>org.neo4j.driver</groupId>
      <artifactId>neo4j-java-driver</artifactId>
      <version>1.0.3</version>
    </dependency>
    <dependency>
      <groupId>org.slf4j</groupId>
      <artifactId>slf4j-api</artifactId>
      <version>1.7.26</version>
    </dependency>
    <dependency>
      <groupId>ch.qos.logback</groupId>
      <artifactId>logback-classic</artifactId>
      <version>1.2.3</version>
    </dependency>
  </dependencies>
</project>
<project xmlns="http://maven.apache.org/POM/4.0.0">
  <modelVersion>4.0.0</modelVersion>
  <groupId>ch.qos.logback</groupId>
  <artifactId>logback-classic</artifactId>
  <version>1.2.3</version>
  <packaging>jar</packaging>
  <dependencies>
    <dependency>
      <groupId>it.unimi.dsi</groupId>
      <artifactId>webgraph</artifactId>
      <version>3.6.1</version>
    </dependency>
    <dependency>
      <groupId>org.eclipse.jgit</groupId>
      <artifactId>org.eclipse.jgit</artifactId>
      <version>5.2.1.201812262042-r</version>
    </dependency>
  </dependencies>
</project>
Depends
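The two POMs above can be read as a tiny dependency graph. The following is a minimal sketch, not part of the talk, of how the transitive dependencies of NeoImport could be computed from it. The dictionary simply restates the direct dependencies declared on the slide (versions omitted); the function name is an illustrative assumption.

```python
from collections import deque

# Direct dependencies as declared in the two POMs shown above,
# keyed by artifactId (versions omitted for brevity).
DIRECT_DEPS = {
    "NeoImport": ["neo4j-java-driver", "slf4j-api", "logback-classic"],
    "logback-classic": ["webgraph", "org.eclipse.jgit"],
}


def transitive_dependencies(root, direct_deps):
    """Return every artifact reachable from `root`, excluding `root` itself."""
    seen = set()
    queue = deque([root])
    while queue:
        current = queue.popleft()
        for dep in direct_deps.get(current, []):
            if dep not in seen and dep != root:
                seen.add(dep)
                queue.append(dep)
    return seen


if __name__ == "__main__":
    for name in sorted(transitive_dependencies("NeoImport", DIRECT_DEPS)):
        print(name)
```

NeoImport declares three dependencies, but the breadth-first walk also reaches the two artifacts that logback-classic pulls in on this slide.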
Dependency graphs
❖ Library+versions and their
dependencies form (complex,
huge) dependency networks
❖ Version constraints make these
networks more complicated
than simple graphs
❖ The package manager ultimately
determines which version is
chosen for each library
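To make the role of version constraints concrete, here is a deliberately simplified sketch of version selection: given the published versions of one library and the ranges requested by its dependents, pick the highest version that satisfies all of them. The data and the half-open range format are hypothetical, and real package managers use their own, more elaborate resolution strategies.

```python
def parse(version):
    """Turn '1.7.26' into (1, 7, 26) so versions compare numerically."""
    return tuple(int(part) for part in version.split("."))


def resolve(available, constraints):
    """Pick the highest available version inside every [low, high) range.

    Returns None when the constraints cannot be satisfied together.
    """
    candidates = [
        v for v in available
        if all(parse(low) <= parse(v) < parse(high) for low, high in constraints)
    ]
    return max(candidates, key=parse) if candidates else None


# Hypothetical data: versions published for one library, and the ranges
# requested by two different dependents of that library.
AVAILABLE = ["1.7.9", "1.7.26", "1.8.0", "2.0.0"]
CONSTRAINTS = [("1.7.0", "2.0.0"), ("1.7.10", "1.8.0")]

chosen = resolve(AVAILABLE, CONSTRAINTS)
print(chosen)
```

Because the outcome depends on which versions exist at resolution time, the same declared constraints can yield a different graph tomorrow, which is why constraint-annotated networks are harder to analyse than plain graphs.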
The dependency heaven
❖ Relying on an ecosystem of easy-to-use,
well-written libraries made the
dream of code reuse a reality
The dependency hell
❖ A bug or security
breach or legal issue
concerning one single
piece…
❖ …can make the whole
tower fall!
Recent dependency nightmares
❖ The left-pad incident (2016): millions of websites
affected
❖ The Equifax breach (2017): cost $4B
Epidemics in dependency graphs
Lib A, vers 1.0
Lib B, vers 2.5
Lib C, vers 1.5
Lib D, vers 3.0
A vulnerability alert
is issued
about Lib D, vers 3.0
All libraries in this
graph are infected!
GitHub security alerts
But is this enough?
The FASTEN Project
❖ Fine-Grained Analysis of SofTware Ecosystems as Networks
❖ Part of the EU H2020-ICT-2018-2020 Program
❖ Consortium
Epidemics in dependency graphs
A.f0, A.f2, A.f3
B.f1, B.f2, B.f3
C.f1, C.f2
D.f1, D.f2, D.f3
A vulnerability alert
is issued
about Lib D, vers 3.0,
function f3
Much more informative!
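The difference between the two granularities can be sketched in a few lines. The function names below come from the slide, but the call edges are hypothetical (the slide's arrows did not survive extraction), so this is only an illustration of the idea, not FASTEN's implementation.

```python
from collections import defaultdict

# Hypothetical (caller, callee) edges between the functions named on the slide.
CALLS = [
    ("A.f0", "B.f1"), ("A.f2", "B.f2"), ("A.f3", "C.f1"),
    ("B.f1", "D.f1"), ("B.f2", "D.f3"), ("B.f3", "D.f2"),
    ("C.f1", "D.f2"), ("C.f2", "D.f1"),
]


def affected_callers(calls, vulnerable):
    """Return every node that can reach `vulnerable` by following call edges."""
    callers_of = defaultdict(set)
    for caller, callee in calls:
        callers_of[callee].add(caller)
    affected, stack = set(), [vulnerable]
    while stack:
        node = stack.pop()
        for caller in callers_of[node]:
            if caller not in affected:
                affected.add(caller)
                stack.append(caller)
    return affected


def library_of(function):
    """'D.f3' -> 'D'."""
    return function.split(".")[0]


def library_level(calls, vulnerable_library):
    """Coarse view: collapse functions into libraries before propagating."""
    lib_edges = [
        (library_of(a), library_of(b))
        for a, b in calls
        if library_of(a) != library_of(b)
    ]
    return affected_callers(lib_edges, vulnerable_library)


coarse = library_level(CALLS, "D")
fine = affected_callers(CALLS, "D.f3")
print(sorted(coarse), sorted(fine))
```

With these assumed edges, the library-level view flags A, B and C, while the function-level view shows that only A.f2 and B.f2 actually reach D.f3, so C is not exposed.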
Examples
❖ Fully precise change impact analysis: “How many libraries
are affected if I remove/modify a certain method/interface?”
❖ Fully precise license compliance: “Is my library compliant
with the licenses of the libraries that I depend on (directly or
indirectly)? (e.g., am I linking any GPL code?)”
❖ Fully precise risk profiling: “Does this vulnerability affect my
code?”
❖ Centrality analysis: “Which methods/functions are most central
within a given ecosystem? Are there bottlenecks? Critical points?”
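As an illustration of the centrality question, the sketch below ranks functions in a small call graph with PageRank. PageRank is used here only as one well-known centrality measure; the talk does not say which measures FASTEN adopts, and the call edges and function names are hypothetical.

```python
def pagerank(edges, damping=0.85, iterations=50):
    """Power-iteration PageRank over (caller, callee) edges."""
    nodes = sorted({n for edge in edges for n in edge})
    out_links = {n: [] for n in nodes}
    for caller, callee in edges:
        out_links[caller].append(callee)
    rank = {n: 1.0 / len(nodes) for n in nodes}
    for _ in range(iterations):
        # Rank held by functions that call nothing is spread evenly.
        dangling = sum(rank[n] for n in nodes if not out_links[n])
        new_rank = {
            n: (1.0 - damping + damping * dangling) / len(nodes) for n in nodes
        }
        for caller in nodes:
            for callee in out_links[caller]:
                new_rank[callee] += damping * rank[caller] / len(out_links[caller])
        rank = new_rank
    return rank


# Hypothetical call graph: every other function ends up calling util.log.
CALL_EDGES = [
    ("app.main", "util.parse"),
    ("app.main", "util.log"),
    ("util.parse", "util.log"),
    ("net.fetch", "util.log"),
    ("net.fetch", "util.parse"),
]

ranks = pagerank(CALL_EDGES)
for name in sorted(ranks, key=ranks.get, reverse=True):
    print(name, round(ranks[name], 3))
```

Functions that many others call, directly or through intermediaries, accumulate rank; in an ecosystem-wide call graph such nodes are candidates for the bottlenecks and critical points the slide asks about.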
The FASTEN toolchain
❖ Sources: project information, security alerts, repositories
❖ Each source publishes to a data stream
❖ FASTEN server: call-graph construction, storage layer, analysis layer
❖ Interfaces: REST API, Web UI
❖ Users: continuous integration server, developer
Updating a library
Before
# Check outdated dependencies
$ pip list --outdated
Package    Version Latest Type
---------- ------- ------ -----
Pygments   2.2.0   2.3.1  wheel
After
# Check outdated dependencies
$ pip list --outdated
Package    Version Latest Type
---------- ------- ------ -----
Pygments   2.2.0   2.3.1  wheel
Updating Pygments will affect:
  foo.py: function colorize
  bar.py: function parse
# Estimate update impact
$ pip install --dry-run Pygments
Function Pygments.Formatter.format[formatter.py] changed ->
  check <your_app> at colorize[foo.py]:32
Information about a library
Before
# Checking info about the library
$ pip show tornado
Name: tornado
Version: 5.0
Summary: Tornado is a Python web framework ...
Home-page: http://www.tornadoweb.org/
Author: Facebook
Author-email: ...
License: http://www.apache.org/licenses/LICENSE-2.0
Location: ...
Requires: backports-abc, futures, singledispatch
Required-by:
After
# Checking info about the library
$ pip show tornado
Name: tornado
Version: 5.0
License: http://www.apache.org/licenses/LICENSE-2.0
...
Maintainers: 3
Community size: 15
Used by: 145 on PyPI, 34433 on GitHub
Latest vulnerability: 13 months ago
All known vulnerabilities: 25 (best 10%)
License rating: Compatible
Who uses my function?
# Check uses of function pkg.list() in dependents
$ pip query --uses pkg.list
depA(v1.2).parse()
depA(v1.2).test()
depB(0.0.2).foo()
depC(1.2.1).calculate()
# Estimate "damage" if pkg.list is updated
$ pip query --total pkg.list
3 direct and 223 indirect dependents will be affected
# Notify direct dependents of upcoming breakage
$ pip query --uses pkg.list |
    cut -f 1 -d '(' |
    xargs -I {} pip show {} |
    grep Author-email: | cut -f 2 -d ':' |
    xargs mail -s 'MyProject update will break yours!'
# Rank them by importance (aka centrality)
$ pip query --uses --rank pkg.list
depC(1.2.1).calculate()
depB(0.0.2).foo()
depA(v1.2).parse()
depA(v1.2).test()
FASTEN challenges
❖ Sound call-graph generation
❖ Scalability (huge graphs!)
❖ Real-time responsiveness
❖ Data integration
❖ Tool development for end users
Network analysis will be the next
step for the future of
software development
Questions?
Paolo Boldi
Università degli Studi di
Milano
Italy
paolo.boldi@unimi.it

Presentation of the FASTEN project, Conference SFScon, Bolzano, Italy
