Building Multi Tenant Java Applications
Rajesh Venkatesan Senior Architect, HCL Technologies [email protected]
Multi Tenancy An Overview
What?
Ability to cater to multiple customers using a shared instance of Software/Hardware
When?
Time Share, ASP, End User Web Apps
Why?
Inability of SOHO and SMB segments to adopt IT. Non-IT businesses getting entrenched in managing IT.
How?
That's what this session is about
Multi Tenancy Impact in the real world
Shared Infrastructure: Lower Cost; Higher Complexity of Construction; Configuration over Customization; Driving Standardization
Dedicated Infrastructure: Higher Cost; Tailor Made; Construction relatively easy; Heavy Customization; Support for Non Standard Requirements
Single Vs MultiTenancy
Multi Tenancy: Higher Scale, Lower Cost; Shared Vulnerability; Shared Upgrades
Single Tenancy: Scalability is bound to target customer size; Minimized Vulnerability; Customized Upgrades
Architectural Facets of Multi Tenancy in the Software World
Shared Infrastructure: Virtualized Hardware, Database, Application Servers
Integration: Inbound, Outbound
Configuration over Customization: Standardization of UI, Data Model, Business Logic
Security: Data Security, Application Security
Shared Infrastructure Database
Typically, multi tenancy at the database level has 3 standard patterns:
Separate Database
Traditional isolated database instance per customer.
Shared Database Separate Schema
Customers get their own schema but are co-hosted in the same database.
Shared Database Shared Schema
Drives the highest efficiency. All customers' data is stored in the same database and schema, with a tenant id qualifier.
[Figure: spectrum from Isolated to Shared: Separate DB, Separate Schema, Shared Schema]
Source: Multi Tenant Data Architecture, Frederick Chong, Gianpaolo Carraro, and Roger Wolter Microsoft Corporation
Database Multi Tenancy Patterns Pros and Cons
Separate Database
Easier to Maintain; Allows Customization; Higher Security; Easy DB Upgrades; High Cost to Customer
Shared Database Separate Schema
Easier to Maintain; Allows Customization; Relatively Higher Security; Slightly Complex DB Upgrades; Average Cost to Customer
Shared Database Shared Schema
Lowest Cost; Complex Upgrade Process; Availability impacts multiple customers; Data Security delegated to application layer
Trade Off Considerations
Compliance/Regulatory; Cost; Operations; Time to Market; Liability
Database Multi Tenancy Implementation
Isolated Database and Shared Database Separate Schema
Standard data access simply returns the appropriate connection based on the tenant context.
From a JDBC perspective, this implies different connection strings based on the customer. Typically, the tenant context is set by an intercepting filter and obtained at the DAO layer, possibly via a ThreadLocal variable.
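The ThreadLocal idea above can be sketched roughly as follows. This is only an illustration under assumptions: `TenantContext` and `TenantConnectionRouter` are hypothetical names, the JDBC URLs are made up, and a real application would hand the resolved URL to a connection pool rather than return a string.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

/** Holds the tenant id for the current request thread (hypothetical helper). */
final class TenantContext {
    private static final ThreadLocal<String> CURRENT = new ThreadLocal<>();

    private TenantContext() {}

    /** Would be called by the intercepting filter once the tenant is identified. */
    static void set(String tenantId) { CURRENT.set(tenantId); }

    static String require() {
        String id = CURRENT.get();
        if (id == null) throw new IllegalStateException("No tenant bound to this thread");
        return id;
    }

    /** Should be called in a finally block, because pooled threads are reused. */
    static void clear() { CURRENT.remove(); }
}

/** Maps each tenant to its own JDBC URL (separate database or separate schema). */
final class TenantConnectionRouter {
    private final Map<String, String> jdbcUrls = new ConcurrentHashMap<>();

    void register(String tenantId, String jdbcUrl) { jdbcUrls.put(tenantId, jdbcUrl); }

    /** Called from the DAO layer; business code never passes the tenant explicitly. */
    String currentJdbcUrl() {
        String tenantId = TenantContext.require();
        String url = jdbcUrls.get(tenantId);
        if (url == null) throw new IllegalArgumentException("Unknown tenant: " + tenantId);
        return url;
    }
}
```

A filter would call `TenantContext.set(...)` before the request is processed and `TenantContext.clear()` afterwards, so that a tenant id never leaks into the next request served by the same thread.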
Shared Database Shared Schema
Approach 1: Business logic and data access are aware of the multi tenant context and therefore query appropriately.
Pros: Easy to build
Cons: High probability of bugs leading to data leakage
Approach 2: Abstract the multi tenancy concern to the Data Access Layer and write business logic without tenant context. The Data Access Layer automatically adds tenant context to all data calls.
Hibernate, for Isolated Database and Shared Database Separate Schema: implement a tenant-aware ConnectionProvider and switch off the second-level cache.
Hibernate, for Shared Database Shared Schema: use Filters or Hibernate Shards.
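Approach 2 for the shared schema can be sketched in plain Java, independent of any ORM. The class names, the `tenant_id` column name, and the `Supplier` used to obtain the current tenant are all assumptions made for this illustration; it is not how Hibernate Filters are implemented.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.function.Supplier;

/** A parameterized SQL statement: the text plus its bind values, in order. */
final class BoundQuery {
    final String sql;
    final List<Object> params;

    BoundQuery(String sql, List<Object> params) {
        this.sql = sql;
        this.params = Collections.unmodifiableList(new ArrayList<>(params));
    }
}

/** Data access helper that always adds the tenant predicate (hypothetical). */
final class TenantScopedQueries {
    private final Supplier<String> currentTenant;

    TenantScopedQueries(Supplier<String> currentTenant) {
        this.currentTenant = currentTenant;
    }

    /**
     * Builds a SELECT whose first predicate is always the tenant id.
     * The table name and filter text must come from trusted code, never from user input.
     */
    BoundQuery select(String table, String businessFilter, Object... businessParams) {
        String tenantId = currentTenant.get();
        if (tenantId == null) throw new IllegalStateException("No tenant in context");
        StringBuilder sql = new StringBuilder("SELECT * FROM ")
                .append(table)
                .append(" WHERE tenant_id = ?");
        List<Object> params = new ArrayList<>();
        params.add(tenantId);
        if (businessFilter != null && !businessFilter.trim().isEmpty()) {
            sql.append(" AND (").append(businessFilter).append(")");
            for (Object p : businessParams) params.add(p);
        }
        return new BoundQuery(sql.toString(), params);
    }
}
```

Because the business layer only supplies its own filter, forgetting the tenant predicate is no longer possible at the call site, which is the main point of Approach 2.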
Integration
Typical integration concerns when applications move out of customer premises include:
How can I receive notifications?
Is there a standard integration?
How can I push data to the application?
How do I orchestrate my business process?
Familiar? SOA?
Integration Contd
Fundamentally, the application must support well defined interfaces for inbound as well as outbound integration.
Inbound Integration
Expose services that are: Technology Independent; Standards Based; High Security; Multi Tenant Aware
Implementation
SOAP: Well defined standard. WSS for multi tenant security (Username/Token, X509 Tenant Certificate, SAML, Kerberos). Java tooling: JAX-WS, WSS4J, Axis, XFire
REST: Easy integration, simplicity. Security to be built on top.
Integration Contd
Outbound Integration
Allow tenants to register for integration events.
Push vs Pull
Push Synchronous: Data can be pushed to waiting WS endpoints. Publish standard web service interfaces that customers can implement. A multi tenant aware integration layer calls out to the tenant specific interface. The problem is the availability of customer endpoints.
Push Asynchronous: Expose a secure asynchronous messaging infrastructure. The messaging infrastructure takes responsibility for ensuring delivery.
Heavy vs Light Weight Events
Light weight: For security and other reasons, push only non-critical information in the message. The listening party then calls back via the standard inbound web service interface for the actual message.
Heavy weight: Push the entire message with all relevant information. This requires that the infrastructure is absolutely secure.
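The light weight event pattern can be illustrated with an in-memory sketch. `LightEvent` and `InboundPayloadService` are hypothetical names, and the map stands in for the real inbound web service and its data store; authentication of the caller is assumed to have happened elsewhere.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Optional;

/** Light weight event: says what changed, but carries no business data. */
final class LightEvent {
    final String tenantId;
    final String type;
    final String resourceId;

    LightEvent(String tenantId, String type, String resourceId) {
        this.tenantId = tenantId;
        this.type = type;
        this.resourceId = resourceId;
    }
}

/** Stand-in for the inbound service that a listener calls back for the payload. */
final class InboundPayloadService {
    private final Map<String, String> payloads = new HashMap<>();

    void store(String tenantId, String resourceId, String payload) {
        payloads.put(tenantId + "/" + resourceId, payload);
    }

    /** Returns the payload only when the authenticated caller owns the event. */
    Optional<String> fetch(String authenticatedTenant, LightEvent event) {
        if (!authenticatedTenant.equals(event.tenantId)) return Optional.empty();
        return Optional.ofNullable(payloads.get(event.tenantId + "/" + event.resourceId));
    }
}
```

Even if the event itself is intercepted, it only reveals that a resource changed; the actual data is released through the secured inbound interface.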
Security
Facets of Security: Physical Security, Application Security, Data Security
Data Security
Data at Rest
Use tenant specific encryption when required (JCA/JCE). Decouple encryption awareness from the data layer, so that a data leak is still harmless.
Trade-offs: Database functions cannot be applied on encrypted fields; performance.
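As a rough sketch of tenant specific encryption with the JCE, the example below keeps one AES key per tenant and encrypts with AES-GCM. The class name is hypothetical, and generating keys in memory is an assumption made to keep the example self-contained; a real deployment would load keys from a key store or key management service.

```java
import java.nio.charset.StandardCharsets;
import java.security.GeneralSecurityException;
import java.security.SecureRandom;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import javax.crypto.Cipher;
import javax.crypto.KeyGenerator;
import javax.crypto.SecretKey;
import javax.crypto.spec.GCMParameterSpec;

final class TenantCrypto {
    private static final int IV_BYTES = 12;   // recommended IV size for GCM
    private static final int TAG_BITS = 128;  // authentication tag length

    private final Map<String, SecretKey> keys = new ConcurrentHashMap<>();
    private final SecureRandom random = new SecureRandom();

    /** Assumption: keys are generated in memory; real keys would come from a key store. */
    private SecretKey keyFor(String tenantId) {
        return keys.computeIfAbsent(tenantId, id -> {
            try {
                KeyGenerator generator = KeyGenerator.getInstance("AES");
                generator.init(128);
                return generator.generateKey();
            } catch (GeneralSecurityException e) {
                throw new IllegalStateException(e);
            }
        });
    }

    /** Returns IV followed by ciphertext, so the value can be stored in one column. */
    byte[] encrypt(String tenantId, String plaintext) {
        try {
            byte[] iv = new byte[IV_BYTES];
            random.nextBytes(iv);
            Cipher cipher = Cipher.getInstance("AES/GCM/NoPadding");
            cipher.init(Cipher.ENCRYPT_MODE, keyFor(tenantId), new GCMParameterSpec(TAG_BITS, iv));
            byte[] ciphertext = cipher.doFinal(plaintext.getBytes(StandardCharsets.UTF_8));
            byte[] out = new byte[iv.length + ciphertext.length];
            System.arraycopy(iv, 0, out, 0, iv.length);
            System.arraycopy(ciphertext, 0, out, iv.length, ciphertext.length);
            return out;
        } catch (GeneralSecurityException e) {
            throw new IllegalStateException("Encryption failed", e);
        }
    }

    /** Fails if the data was encrypted under another tenant's key or was tampered with. */
    String decrypt(String tenantId, byte[] stored) {
        try {
            Cipher cipher = Cipher.getInstance("AES/GCM/NoPadding");
            cipher.init(Cipher.DECRYPT_MODE, keyFor(tenantId),
                    new GCMParameterSpec(TAG_BITS, stored, 0, IV_BYTES));
            byte[] plaintext = cipher.doFinal(stored, IV_BYTES, stored.length - IV_BYTES);
            return new String(plaintext, StandardCharsets.UTF_8);
        } catch (GeneralSecurityException e) {
            throw new IllegalStateException("Decryption failed", e);
        }
    }
}
```

With per-tenant keys, a row that leaks to the wrong tenant through an application bug cannot be decrypted with that tenant's key.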
Tokenization of Data: Only a token reference is stored in the database. The actual data has to come from a high security data protection server.
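A minimal sketch of tokenization follows. `TokenVault` is a hypothetical in-memory stand-in for the high security data protection server; the token format is an assumption.

```java
import java.util.Map;
import java.util.UUID;
import java.util.concurrent.ConcurrentHashMap;

/** In-memory stand-in for a data protection server (hypothetical). */
final class TokenVault {
    private final Map<String, String> valuesByToken = new ConcurrentHashMap<>();

    /** Stores the sensitive value in the vault and returns an opaque token. */
    String tokenize(String tenantId, String sensitiveValue) {
        String token = "tok_" + UUID.randomUUID();
        valuesByToken.put(tenantId + ":" + token, sensitiveValue);
        return token;
    }

    /** Resolves a token, but only for the tenant that created it. */
    String detokenize(String tenantId, String token) {
        String value = valuesByToken.get(tenantId + ":" + token);
        if (value == null) throw new IllegalArgumentException("Unknown token for tenant");
        return value;
    }
}
```

The application database would store only the returned token, so a dump of that database contains no sensitive values.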
Data in Transit
Use secure means of transfer (https) and add an authentication/authorization layer on top. Use in-wire encryption for highly critical data (JSSE).
Application Security
Application security is not different from traditional applications, but some aspects become a lot more critical. Exposing the application on the web brings about a gamut of application security threats. Be aware of possible security vulnerabilities and address them. The OWASP Top Ten Project (http://www.owasp.org) is a good place to look.
A1: Injection
A2: Cross-Site Scripting (XSS)
A3: Broken Authentication and Session Management
A4: Insecure Direct Object References
A5: Cross-Site Request Forgery (CSRF)
A6: Security Misconfiguration
A7: Insecure Cryptographic Storage
A8: Failure to Restrict URL Access
A9: Insufficient Transport Layer Protection
A10: Unvalidated Redirects and Forwards
Application Security Contd
Some of the security best practices for applications:
Encrypt all communication between the browser and server via SSL.
Enforce a strong password policy using a configurable password policy.
Store passwords in the database only after one-way hashing, so that it is impossible to recover user passwords.
Auto-generated passwords automatically expire after xx hours.
Use token based authentication with zero trust on server side sessions.
All access to the application is authenticated and is secured either by an authentication token or via certificates.
Decouple authentication and authorization, and consolidate these concerns in order to establish a single point of control of user access.
Use RBAC, ensuring there are no super-users who get access to the system.
Extensive logging capability, ensuring every action is traceable to the user, request and session, along with the actual change to the database.
Database credentials created with named permissions.
OS credentials created with named permissions.
All inbound and outbound interface points must be secured by default (SSL).
Additional tenant aware security measures like tenant specific certificates.
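One-way password storage can be sketched with the JDK's PBKDF2 support. The class name, the iteration count, and the `iterations:salt:hash` storage format are assumptions chosen for this illustration, not recommendations from the slides.

```java
import java.security.GeneralSecurityException;
import java.security.MessageDigest;
import java.security.SecureRandom;
import java.util.Base64;
import javax.crypto.SecretKeyFactory;
import javax.crypto.spec.PBEKeySpec;

final class PasswordHasher {
    private static final int ITERATIONS = 100_000; // assumption: tune for your hardware
    private static final int SALT_BYTES = 16;
    private static final int HASH_BITS = 256;

    private PasswordHasher() {}

    /** Returns "iterations:salt:hash" (Base64); the password itself is never stored. */
    static String hash(char[] password) {
        byte[] salt = new byte[SALT_BYTES];
        new SecureRandom().nextBytes(salt);
        byte[] hash = pbkdf2(password, salt, ITERATIONS);
        Base64.Encoder encoder = Base64.getEncoder();
        return ITERATIONS + ":" + encoder.encodeToString(salt) + ":" + encoder.encodeToString(hash);
    }

    /** Recomputes the hash with the stored salt and compares in constant time. */
    static boolean verify(char[] password, String stored) {
        String[] parts = stored.split(":");
        int iterations = Integer.parseInt(parts[0]);
        byte[] salt = Base64.getDecoder().decode(parts[1]);
        byte[] expected = Base64.getDecoder().decode(parts[2]);
        byte[] actual = pbkdf2(password, salt, iterations);
        return MessageDigest.isEqual(expected, actual);
    }

    private static byte[] pbkdf2(char[] password, byte[] salt, int iterations) {
        try {
            PBEKeySpec spec = new PBEKeySpec(password, salt, iterations, HASH_BITS);
            return SecretKeyFactory.getInstance("PBKDF2WithHmacSHA256")
                    .generateSecret(spec)
                    .getEncoded();
        } catch (GeneralSecurityException e) {
            throw new IllegalStateException("PBKDF2 not available", e);
        }
    }
}
```

Because each password gets a random salt, two users with the same password end up with different stored values.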
Application Security Federated Identity
[Figure: Tenant 1 through Tenant n, each with its own Corporate LDAP, connected to the Multi Tenant Application]
With applications moving outside the customer premises, corporate users are forced to have multiple identities: a corporate identity and an in-cloud application identity. This poses a security problem for customers, since a person who leaves the company may still have access to corporate data through the in-cloud identity. It therefore becomes necessary to allow identity to be federated from the corporate context. The application has to be ready to:
Decouple Identity Management and Authentication.
Support delegation of IdM and authentication to corporate systems through established standards like SAML.
Configuration Over Customization
In order to drive efficiency, an application must standardize its features. However this results in not being able to accommodate customers with alternate business processes. This results in an architectural requirement: How to support customization via configuration?
Database: Allow extension of existing entities
Business Logic: Business Logic Templates; Allow pluggable business logic; Allow small changes to business process
UI: Metadata driven UI; Customize Look and Feel, Layout, Content
UI Customization
Depending on requirements, UI customization is done at various depths:
Look and Feel: The ability to change the font, color and style of the existing UI
Layout: The ability to switch component layouts
Content: The ability to choose what content goes where
Two Approaches
Template/Skin Based: Allow tenants to choose between different themes; the ability to write new themes is restricted but possible. This is the standard mechanism followed by most websites (Blogger, WordPress, Liferay).
Complete Customization: Allows tenants to customize the UI as per their requirements. How much they can customize is left to the application. Drag and drop UI to customize.
Both approaches require a metadata layer that can understand the customization done by specific tenants. UI rendering must take into account a standard layout as well as the metadata for rendering. Accommodate tenant specific UI data models that can be extensions to standard data models.
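The metadata layer for UI rendering can be sketched as a set of standard settings with tenant overrides applied on top. `UiMetadata` and the setting keys used in the usage note are hypothetical; a real layer would persist this metadata and cover layout and content as well.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.LinkedHashMap;
import java.util.Map;

/** Resolves the effective UI settings for a tenant (hypothetical metadata layer). */
final class UiMetadata {
    private final Map<String, String> defaults = new LinkedHashMap<>();
    private final Map<String, Map<String, String>> overridesByTenant = new HashMap<>();

    /** Registers a setting of the standard layout. */
    void setDefault(String key, String value) {
        defaults.put(key, value);
    }

    /** Records a tenant specific customization of one setting. */
    void override(String tenantId, String key, String value) {
        overridesByTenant.computeIfAbsent(tenantId, t -> new HashMap<>()).put(key, value);
    }

    /** Standard layout first, then the tenant's customizations on top. */
    Map<String, String> resolve(String tenantId) {
        Map<String, String> effective = new LinkedHashMap<>(defaults);
        Map<String, String> overrides = overridesByTenant.get(tenantId);
        if (overrides != null) effective.putAll(overrides);
        return Collections.unmodifiableMap(effective);
    }
}
```

For example, after `setDefault("theme.color", "blue")` and `override("acme", "theme.color", "green")`, rendering for tenant `acme` would use green while every other tenant keeps the standard blue.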
Business Logic & Database Customization
Business Process Customization
Enable an application to be flexible in allowing changes to business logic. Allow different workflows to be configured per tenant.
At the application design level: Follow a highly de-coupled, pluggable, component based design. Use the standard IoC pattern to plug in new implementations.
Reference: Multi Tenant Data Architecture, Frederick Chong, Gianpaolo Carraro, and Roger Wolter, Microsoft Corporation
Database Customization
Ability to extend the schema as per specific requirements. In the Shared Database Separate Schema and Separate Database patterns, this becomes trivial as the customization can be done directly. In the Shared Database Shared Schema pattern, the following approaches are standard:
To have a pre-determined set of fields for specific data models that can be used as extensions.
To have a generic extension schema that can accommodate customization to any entity, and a data access and business logic layer that can bring in the tenant context when querying.
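The generic extension schema approach can be illustrated in memory. The two maps below stand in for a field-definition table and a name-value table; `ExtensionFieldStore` and all field names are hypothetical.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

/** In-memory stand-in for a generic extension schema (hypothetical). */
final class ExtensionFieldStore {
    // (tenant, entity) -> names of the extension fields that tenant has defined
    private final Map<List<String>, Set<String>> definedFields = new HashMap<>();
    // (tenant, entity, recordId) -> extension field values for that record
    private final Map<List<String>, Map<String, String>> values = new HashMap<>();

    void defineField(String tenantId, String entity, String fieldName) {
        definedFields.computeIfAbsent(Arrays.asList(tenantId, entity), k -> new HashSet<>())
                .add(fieldName);
    }

    /** Rejects fields the tenant has not defined, so typos do not create stray data. */
    void setValue(String tenantId, String entity, String recordId, String fieldName, String value) {
        Set<String> allowed = definedFields.get(Arrays.asList(tenantId, entity));
        if (allowed == null || !allowed.contains(fieldName)) {
            throw new IllegalArgumentException("Field not defined for tenant: " + fieldName);
        }
        values.computeIfAbsent(Arrays.asList(tenantId, entity, recordId), k -> new HashMap<>())
                .put(fieldName, value);
    }

    /** Every lookup is keyed by tenant, so one tenant never sees another's extensions. */
    Map<String, String> valuesFor(String tenantId, String entity, String recordId) {
        Map<String, String> found = values.get(Arrays.asList(tenantId, entity, recordId));
        return found == null
                ? Collections.<String, String>emptyMap()
                : Collections.unmodifiableMap(found);
    }
}
```

The trade-off of this design is that extension values are stored as generic name-value pairs, so typed queries and indexes on them need extra work.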
At the functional level: Decide on the smaller variations that a business process/logic can take, and make these configurable. Allow the ability to plug in newer processes as the application evolves. Accommodate generic data models during processing to cater to extended schemas. Again, a metadata layer is required to understand the configuration done by tenants at the business process level, as well as any newer business processes that are available.
Spring is one IoC container that can be used to plug in implementations.
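Pluggable business logic can be sketched without a container as a registry that resolves an implementation per tenant. `DiscountPolicy` and `DiscountPolicyRegistry` are hypothetical names, and the discount rule is an invented example of a small business process variation; with Spring, the implementations would be wired as beans instead.

```java
import java.util.HashMap;
import java.util.Map;

/** One pluggable piece of business logic (hypothetical example). */
interface DiscountPolicy {
    long discountCents(long orderTotalCents);
}

/** Resolves the implementation configured for a tenant, else the standard one. */
final class DiscountPolicyRegistry {
    private final DiscountPolicy standard;
    private final Map<String, DiscountPolicy> byTenant = new HashMap<>();

    DiscountPolicyRegistry(DiscountPolicy standard) {
        this.standard = standard;
    }

    /** Plugs in a tenant specific implementation without touching the callers. */
    void plugIn(String tenantId, DiscountPolicy policy) {
        byTenant.put(tenantId, policy);
    }

    DiscountPolicy forTenant(String tenantId) {
        return byTenant.getOrDefault(tenantId, standard);
    }
}
```

Calling code asks the registry for the policy of the current tenant and stays unaware of which implementation it receives.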
Scalability
Data
In the case of an RDBMS with Shared Database Shared Schema, use partitioning by tenant id (sharding).
Give a thought to NoSQL databases (for example Hadoop HBase) if dealing with multiple TB of data (ACID vs BASE).
Application Server
Clustering: Make services as stateless as possible. Session replication is a nightmare. Avoid the file system for data; use a central datastore.
De-Coupled Components: Conceptualize application features that can be de-coupled and scaled separately. This allows a resource hogging feature to be separated out and its scaling strategy planned differently.
Cache data where possible (memory IS cheap).
Plan for failure: auto recovery.
UI
With the current scope of browser capabilities (HTML5), pushing state to the browser has become easier.
Frameworks like GWT have also enabled complex applications to sit on the client side. For applications using more sophisticated RIA clients (OpenLaszlo, Flex or Silverlight), the same principle applies.
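Partitioning by tenant id, mentioned under Data, can be sketched as a simple hash-based router. `TenantShardRouter` and the shard names are hypothetical, and plain modulo hashing is an assumption made for brevity: it keeps all of a tenant's rows on one shard, but changing the number of shards remaps most tenants.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

/** Picks a shard for a tenant so that all of a tenant's data lives together. */
final class TenantShardRouter {
    private final List<String> shards;

    TenantShardRouter(List<String> shards) {
        if (shards.isEmpty()) throw new IllegalArgumentException("At least one shard required");
        this.shards = Collections.unmodifiableList(new ArrayList<>(shards));
    }

    /** floorMod keeps the index non-negative even when hashCode() is negative. */
    String shardFor(String tenantId) {
        int index = Math.floorMod(tenantId.hashCode(), shards.size());
        return shards.get(index);
    }
}
```

A lookup table from tenant id to shard is a common alternative when tenants differ widely in size, since it allows a large tenant to be moved to its own shard.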
Questions?