Information Ecosystem (Infra)structures
Kalle Launiala, ”The Ball” / ”Caloom”
+358445575665
Structure of Presentation
• Intro: App Developer Perspective
  • Open Data based ”app” development
  • Competition / Hackathon driven approach – towards production
• Intro: Open Data Provider Perspective
  • Reasons to open data/interfaces, positioning in the ecosystem
• Intro: Support Infrastructure Perspective
  • Developer & database infrastructure, datacenter, cloud providers
• Combine: Service Infrastructure ”Development” Perspective
  • Identify the possibilities to accelerate the above parties
Naming concrete parties
• Some infrastructure and service providers are named to give concrete examples
• This is not necessarily due to current partnerships
• Every named party can be replaced by another provider in the same role
  • New parties are encouraged to just join in and start providing their services – to replace existing ones or to add value
App Developer
What are key drivers; hacking & testing, swift development. Production and maintenance reality of active apps.
App Developer: Full solution stack/task list
1. Identify open data provider(s) to use
2. Optional: Identify existing reusable data or software library/blocks
3. Study ”how-to” of 1 and 2; SDK/API, data format to use
4. Optional: Index the combination of data – might require a full open data export to the developer's ”own” database
5. Implement UI ”App”; web app, mobile app – something with user interface
6. Optional: Combine data source with user-specific data – non-sensitive such as favorites, or very sensitive such as real-time location or private calendar
7. Optional: From ”hacking” to production grade
8. Optional: Store reusable parts for self, or share with community
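Steps 1, 4 and 6 of the task list can be sketched in a few lines. This is only an illustration under stated assumptions: the two sources, their field names (stop_id, line, minutes) and the functions build_combined_index and favorites_view are hypothetical and do not come from any real open data API.

```python
from collections import defaultdict

# Hypothetical records standing in for two open data exports (step 1).
SOURCE_A = [  # stops, keyed by stop_id
    {"stop_id": "S1", "name": "Central Station"},
    {"stop_id": "S2", "name": "Harbour"},
]
SOURCE_B = [  # departures that reference the same stop_id
    {"stop_id": "S1", "line": "4", "minutes": 3},
    {"stop_id": "S1", "line": "7", "minutes": 9},
    {"stop_id": "S2", "line": "4", "minutes": 12},
]


def build_combined_index(stops, departures):
    """Step 4: combine both sources into one index held in the app's own store."""
    by_stop = defaultdict(list)
    for dep in departures:
        by_stop[dep["stop_id"]].append(
            {"line": dep["line"], "minutes": dep["minutes"]}
        )
    return {
        stop["stop_id"]: {
            "name": stop["name"],
            "departures": by_stop[stop["stop_id"]],
        }
        for stop in stops
    }


def favorites_view(index, favorite_stop_ids):
    """Step 6: filter the combined data with non-sensitive user-specific data."""
    return [index[sid] for sid in favorite_stop_ids if sid in index]


index = build_combined_index(SOURCE_A, SOURCE_B)
# "S9" is a favorite that no longer exists in the open data; it is skipped.
view = favorites_view(index, ["S2", "S9"])
```

The combined index is exactly the part that later slides propose to share between apps, while the list of favorites stays with the user-specific data.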
[Diagram: Open Data Source A, Open Data Source B; Application Combined/Refined Data; Application User-Specific Data; Application Business Logic & Back-End Server; Web App, Mobile App]
Open Data Provider
Reasons to provide the data; to serve the end-customer through developers
Data Provider: Making data ”easily available”
1. Identify relevant raw data
2. Identify required refined and indexed format
3. Provide resources to process from Raw Data => Open Data
4. Provide resources to store Open Data sources
5. Provide resources to serve Open Data sources
6. Provide ”How-To” documentation and keep it up to date
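The Raw Data => Open Data processing of steps 2 and 3 could look roughly like the sketch below. The raw export, its column names and the functions refine and index_by_id are invented for illustration; a real provider's raw format and publishing rules will differ.

```python
import csv
import io
import json

# Hypothetical raw export; columns and values are invented for this sketch.
RAW_CSV = """id;Name ;lat;lon
1; Central Station ;60.1719;24.9414
2;Harbour;;
3; Airport ;60.3172;24.9633
"""


def refine(raw_text):
    """Step 3: clean raw rows and drop the ones without coordinates."""
    reader = csv.DictReader(io.StringIO(raw_text), delimiter=";")
    records = []
    for row in reader:
        # Normalize header names and strip stray whitespace from values.
        row = {key.strip().lower(): value.strip() for key, value in row.items()}
        if not row["lat"] or not row["lon"]:
            continue  # incomplete raw row, not published
        records.append({
            "id": int(row["id"]),
            "name": row["name"],
            "lat": float(row["lat"]),
            "lon": float(row["lon"]),
        })
    return records


def index_by_id(records):
    """Step 2: one possible indexed format to serve lookups from."""
    return {record["id"]: record for record in records}


open_data = refine(RAW_CSV)
published = json.dumps(open_data, indent=2)  # the document that would be served
```

Keeping the refining code this small and version controlled is what makes it a reusable artifact, and its input and output formats are what the ”How-To” documentation of step 6 has to describe.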
[Diagram: Raw Data Source A, Raw Data Source B; Data Refining, Processing, Reformatting, Indexing...; Open Data Source X, Open Data Source Y, Open Data Source Z; How-To Documentation about the usage, including SDK/API and data format usage, examples]
Support Infrastructure
Providing the required infrastructure to run dev/test/production apps and services on
Bigger Picture: Infrastructure = CPU + Storage + Network Traffic
[Diagram: the Open Data Provider pipeline (Raw Data Source A, Raw Data Source B; Data Refining, Processing, Reformatting, Indexing...; Open Data Source X, Y and Z) shown four times, and the App Developer stack (Open Data Source A, Open Data Source B; Application Combined/Refined Data; Application User-Specific Data; Application Business Logic & Back-End Server; Web App, Mobile App) shown three times]
Infrastructure Resource Factors
1. Amount of data to store
2. Amount of data to process; how often to refresh
3. Optional: Dynamic queries or flat-served raw data
4. Amount of data to serve
5. Combined data to store
6. Combined data to process (refining & re-indexing)
7. Application logic & back-end processing
8. Application network usage
[Diagram: one Open Data Provider pipeline (Raw Data Source A, Raw Data Source B; Data Refining, Processing, Reformatting, Indexing...; Open Data Source X, Y and Z) and one App Developer stack (Open Data Source A, Open Data Source B; Application Combined/Refined Data; Application User-Specific Data; Application Business Logic & Back-End Server; Web App, Mobile App)]
Infrastructure Costs = Cloud Pricing Units
• Public cloud platform pricing is not business case driven – it actually reflects the cost structure transparently
  • Windows Azure, Amazon EC2, Google AppEngine, ...
• CPU = Virtual Machine Reservation
• Storage = Storage Infrastructure + Transaction Cost
• Network = Network Infrastructure
• CPU: Consolidated use = Cheap, full load = Expensive
• Inbound network traffic: Underused = Free
• Outbound network traffic: Constrained = Expensive
• Storage: Redundancy & Scalability factored = Cheap
• Datacenter internal traffic: Scalability enabling element = Free
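The pricing units above can be turned into a back-of-the-envelope estimate. The sketch below assumes a simple linear model; every unit price in UnitPrices is a made-up placeholder rather than a real provider's price, and monthly_cost is a hypothetical helper. Only the "inbound and datacenter-internal traffic are free" part follows the slide.

```python
from dataclasses import dataclass


@dataclass
class UnitPrices:
    """Placeholder unit prices; not taken from any real price list."""
    vm_hour: float = 0.10
    storage_gb_month: float = 0.05
    storage_per_10k_transactions: float = 0.01
    outbound_gb: float = 0.10
    inbound_gb: float = 0.0   # per the slide: inbound traffic is free
    internal_gb: float = 0.0  # per the slide: datacenter-internal traffic is free


def monthly_cost(prices, vm_count, hours, stored_gb, transactions,
                 outbound_gb, inbound_gb=0.0, internal_gb=0.0):
    """Infrastructure = CPU + Storage + Network Traffic, priced per unit."""
    cpu = vm_count * hours * prices.vm_hour
    storage = (stored_gb * prices.storage_gb_month
               + transactions / 10_000 * prices.storage_per_10k_transactions)
    network = (outbound_gb * prices.outbound_gb
               + inbound_gb * prices.inbound_gb
               + internal_gb * prices.internal_gb)
    return {"cpu": cpu, "storage": storage, "network": network,
            "total": cpu + storage + network}


prices = UnitPrices()
# The same workload twice: first with all served data leaving the datacenter,
# then with 450 of the 500 GB consumed by services in the same datacenter.
remote = monthly_cost(prices, vm_count=2, hours=720, stored_gb=100,
                      transactions=1_000_000, outbound_gb=500)
colocated = monthly_cost(prices, vm_count=2, hours=720, stored_gb=100,
                         transactions=1_000_000, outbound_gb=50,
                         internal_gb=450)
```

With these placeholder prices only the network component differs between the two runs, which is the effect of keeping data providers and apps in the same datacenter.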
Service Infrastructure ”Development”
Addressing common needs between support infrastructure, open data providers and actual app developers
Identify common parts to accelerate, so that this...
[Diagram: the Open Data Provider pipeline (Raw Data Source A, Raw Data Source B; Data Refining, Processing, Reformatting, Indexing...; Open Data Source X, Y and Z) shown four times, and the App Developer stack (Open Data Source A, Open Data Source B; Application Combined/Refined Data; Application User-Specific Data; Application Business Logic & Back-End Server; Web App, Mobile App) shown three times]
App Developer ”Communizable” Parts
Open Data Usage
1. Identify common combined data sources
2. Unify the ways to combine & index data sources (HINT: consider that Open Data Providers also unify on this)
3. Share the combination like any other data source
Private Data Usage
1. Identify use-specific privacy storage needs
2. Unify the ways to manage private data
3. Still store each app's data separately, but with the same structure
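One way to read "separate, but with same structure" is a single record type shared by all apps, with each app's records kept in its own partition. The sketch below is an in-memory illustration only; PrivateRecord, PrivateDataStore and their fields are hypothetical names, and a real service would add authentication, encryption and persistent storage.

```python
from dataclasses import dataclass, field
from typing import Any, Dict, List


@dataclass
class PrivateRecord:
    """One record structure shared by all apps (hypothetical fields)."""
    user_id: str
    kind: str        # e.g. "favorite" or "location"
    sensitive: bool
    payload: Dict[str, Any] = field(default_factory=dict)


class PrivateDataStore:
    """Keeps each app's records in its own partition, all with one structure."""

    def __init__(self) -> None:
        self._partitions: Dict[str, List[PrivateRecord]] = {}

    def put(self, app_id: str, record: PrivateRecord) -> None:
        self._partitions.setdefault(app_id, []).append(record)

    def query(self, app_id: str, user_id: str,
              include_sensitive: bool = False) -> List[PrivateRecord]:
        """Return one user's records for one app; sensitive ones only on request."""
        records = self._partitions.get(app_id, [])
        return [r for r in records
                if r.user_id == user_id
                and (include_sensitive or not r.sensitive)]


store = PrivateDataStore()
store.put("traffic-app", PrivateRecord("u1", "favorite", False, {"stop": "S2"}))
store.put("traffic-app", PrivateRecord("u1", "location", True, {"lat": 60.17}))
store.put("calendar-app", PrivateRecord("u1", "favorite", False, {"room": "A"}))

traffic_default = store.query("traffic-app", "u1")
traffic_all = store.query("traffic-app", "u1", include_sensitive=True)
```

Because every app uses the same record structure, the management code can be shared, while a query against one app never returns another app's data.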
[Diagram: two App Developer stacks side by side. First: Open Data Source A, Open Data Source B; Application Combined/Refined Data; Application User-Specific Data; Application Business Logic & Back-End Server; Web App, Mobile App. Second: Combination of Source A & Source B; Application Combined/Refined Data; Application User-Specific Data; Application Business Logic & Back-End Server; Web App, Mobile App]
Single App Perspective
1. Unified data source publishing = unified ”How-To” documentation & examples
2. Consolidated indexing and format processing = responsibility taken off each individual app developer
3. Unified, transparent management of private data
[Diagram: Raw Data Source(s); Data Refining, Processing, Reformatting, Indexing...; Open Data Source(s); How-To Documentation about the usage, including SDK/API and data format usage, examples; Application Public, Structured Open Data; Application User-Specific Private Data; Application Business Logic & Back-End Server; Web App, Mobile App]
Roles and reusables recognized
• App developers are the key workforce
  • They are the critical resource – they do not need any other party to still provide the ”apps”
• End user needs are the key motivation
  • Open data providers and infrastructure providers essentially aim for sustainable value / growth
  • End user needs are the value that is monetizable
  • End user privacy concerns are essential to understand
  • Unifying around a ”half-assed” solution that fails to execute with personal, location-bound smart traffic or medical data will result in ”crappy one-shot apps”
• Accelerating app developers enables all the others
  • Communicating ”why to unify” is a challenge
  • Acceleration must bring an incremental benefit to motivate experts
  • Unifying for acceleration will enable novice devs as well
• Reusability = Unified model of operation, where applicable
  • Benefits every role, when done properly from every role’s unique perspective
  • Lowers the overhead for guidance, enables roles to be self-sufficient and self-evolving
  • Do not over-unify across role boundaries = each role should have clear objectives in the big picture
... Becomes like this
[Diagram: multiple Raw Data Source(s); Data Refining, Processing, Reformatting, Indexing...; multiple Open Data Source(s); multiple Application Structured Open Data stores; multiple Application User-Specific Private Data stores; Application Business Logic & Back-End Servers; Web Apps, Mobile Apps]
The End User
Infrastructure Enablers
• Storage = Hosting/Datacenter/Cloud Providers
• Computing = Hosting/Datacenter/Cloud Providers
• Scalability = Hosting/Datacenter/Cloud Providers
• App developer team = Version Control + Team Management
Reusable Digital Artifacts
• Data processing = version controllable source code
  • SQL statements, custom code, export/import scripts
• Data storage = SQL & NoSQL data storage
  • SQL database servers, measured & communized
  • Graph databases, custom indexing = NoSQL storages
• Application components = version controllable
  • Unified libraries
  • Controlled private data management – shared repository
Identified Business Opportunities
• Cloud Provider Opportunity
  • Open for any ISV to use = Favor Public providers
  • Massive data amounts, storage & network = Favor massive clouds
  • Network performance & outbound traffic cost
• FAVOR SINGLE DATACENTER FOR WHOLE ECOSYSTEM
  • First mover advantage for pilot / critical mass
  • Subsidizing raw data providers may be an option, profiting from app developers / ISVs who pay for their own usage
  • Case for Microsoft Azure, Amazon EC2, Google AppEngine and the like
• Version Control & Team Control Opportunity
  • Unified management artifacts
  • Central library repository for shared app components
  • Including earning models for paid subscriptions for private repositories
  • Case for GitHub, Gitorious and the like