Foundation APIs and Repository Internals

Alfresco Repository Internals. Derek Hulley, Senior Engineer, Alfresco. Twitter: @derekhulley

Upload: alfresco-software

Post on 28-Nov-2014

DESCRIPTION

In this session we will start by examining some of the features that developers have available when using the Foundation service interfaces: how to initiate and use transactions; how and when to make use of transactional resources; using different types of search; controlling behaviors, e.g. 'cm:auditable'; changing CopyService behavior. Following this, some repository internals will be examined, including: typical content lifecycles and the parameters that control them; schema generation files, upgrade scripts and runtime SQL (3.4-specific); considerations for large-scale custom data structures.

TRANSCRIPT

Page 1: Foundation APIs and Repository Internals

Alfresco Repository Internals

Derek Hulley, Senior Engineer, Alfresco

twitter: @derekhulley

Page 2: Foundation APIs and Repository Internals

Alfresco Repository Internals: Agenda (1)

1. Resource Contention
   1. Node Creation and Modification
   2. Actions
   3. Scheduled Jobs
2. Transactions
   1. Resources
   2. Implicit and Explicit
   3. Using Alfresco’s Transaction Support

Page 3: Foundation APIs and Repository Internals

Alfresco Repository Internals: Agenda (2)

3. Navigating Hierarchies
   1. Lucene-based APIs
   2. NodeService-based APIs
   3. Walking Child Associations
4. Policies and Behaviour
   1. Policy Behaviour Filters
   2. CopyService

Page 4: Foundation APIs and Repository Internals

Alfresco Repository Internals: Agenda (3)

5. Content Lifecycles
   1. ContentData Properties
   2. Binary Files and Transactions
   3. Orphaned Content
   4. System Properties
6. Application Bootstrap
   1. Spring init
   2. Lifecycle Classes
   3. Modules
7. Questions

Page 5: Foundation APIs and Repository Internals

Resource Contention: Node Creation

• Row inserts
• Read-committed isolation: invisible until commit
• Database, caches, Lucene indexes: low contention

Page 6: Foundation APIs and Repository Internals

Resource Contention: Node Modification

• Update type, aspects, properties, ACLs; move; delete; etc.
• Invisible until commit, but can hold resource locks
• Transactions can be rejected, e.g. with ConcurrencyFailureException

Page 7: Foundation APIs and Repository Internals

Resource Contention: Actions and Scheduled Jobs

• Danger of background jobs moving ‘up’ a hierarchy

[Diagram: three-level node hierarchy (L1N1; L2N1, L2N2; L3N1, L3N1), captioned "Only one winner (at a time)"]

• Individual node modifications are serialized
• Pick up small chunks, commit and give way
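
The "pick up small chunks, commit and give way" advice could be sketched as below. This is plain Java with no Alfresco dependencies: `ChunkedJobSketch`, `processInChunks` and the `commitChunk` callback are hypothetical names, and each callback invocation merely stands in for one short transaction.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Consumer;

// Hypothetical sketch, not an Alfresco API. Each chunk stands in for one
// short transaction that commits before the next chunk is picked up.
class ChunkedJobSketch {

    // Splits the work into chunks of at most chunkSize items and hands each
    // chunk to commitChunk. Returns the number of chunks processed.
    static <T> int processInChunks(List<T> items, int chunkSize, Consumer<List<T>> commitChunk) {
        if (chunkSize <= 0) {
            throw new IllegalArgumentException("chunkSize must be positive");
        }
        int chunks = 0;
        for (int start = 0; start < items.size(); start += chunkSize) {
            int end = Math.min(start + chunkSize, items.size());
            // Copy the slice so the callback cannot alter the source list.
            commitChunk.accept(new ArrayList<>(items.subList(start, end)));
            chunks++;
            // Hint that other threads may run between chunks ("give way").
            Thread.yield();
        }
        return chunks;
    }

    public static void main(String[] args) {
        List<Integer> nodeIds = new ArrayList<>();
        for (int i = 0; i < 10; i++) {
            nodeIds.add(i);
        }
        int chunks = processInChunks(nodeIds, 4, chunk -> System.out.println("committed " + chunk));
        System.out.println("chunks: " + chunks);
    }
}
```

Keeping each chunk small limits how long any lock is held, so a background job working up a hierarchy blocks user-facing modifications only briefly.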

Page 8: Foundation APIs and Repository Internals

Transactions: Resources

[Diagram: a transaction and the resources it uses]

• Thread Pools
  • Each transaction requires a thread
  • Possibly multiple transactions on a thread
• Database Rows
  • Database row locking
  • ‘version’ column: optimistic locking
• Lucene Indexes
  • Index deltas
  • Heavy on IO
• Connection Pool
  • One transaction – one connection
  • Connection housekeeping has a cost
• Content Binaries
  • New content binaries only
  • Temporary files
• Caches
  • Caches are transaction-aware
  • Conflicts drop cache entries

It pays to think about resource contention.

Replaying transactions means:
• Reclaiming resources
• CPU cycles
• Slower response times
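
The ‘version’ column bullet refers to the general optimistic-locking pattern, which might look like the sketch below. It is an in-memory stand-in, not Alfresco's schema or SQL: `VersionedRowSketch` and its methods are hypothetical names chosen for illustration.

```java
// Hypothetical in-memory stand-in for a database row with a 'version' column.
class VersionedRowSketch {
    private String value;
    private long version;

    VersionedRowSketch(String value) {
        this.value = value;
        this.version = 0L;
    }

    synchronized long getVersion() {
        return version;
    }

    synchronized String getValue() {
        return value;
    }

    // Follows the generic pattern
    //   UPDATE ... SET value = ?, version = version + 1 WHERE id = ? AND version = ?
    // The write succeeds only if nobody has bumped the version since it was read.
    synchronized boolean update(long expectedVersion, String newValue) {
        if (version != expectedVersion) {
            return false; // stale read: the caller must re-read and retry
        }
        value = newValue;
        version++;
        return true;
    }

    public static void main(String[] args) {
        VersionedRowSketch row = new VersionedRowSketch("initial");
        long seenByA = row.getVersion();
        long seenByB = row.getVersion();
        System.out.println("A updated: " + row.update(seenByA, "from A"));
        System.out.println("B updated: " + row.update(seenByB, "from B"));
    }
}
```

The second writer loses without ever having blocked the first, which is why the losing transaction has to be replayed.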

Page 9: Foundation APIs and Repository Internals

Transactions: Implicit

• Defined against public service (Foundation) APIs
• Bean naming convention: NodeService vs nodeService
• Cost is in starting a transaction, not continuing one
• public-services-context.xml and ServiceRegistry
• Spring customization and interceptors

Page 10: Foundation APIs and Repository Internals

Transactions: Explicit

• Wrap all atomic operations, including groups of reads
• Use RetryingTransactionHelper rather than UserTransaction
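
The idea behind preferring a retrying helper over hand-managed transactions can be illustrated with a small, self-contained sketch. It does not reproduce the RetryingTransactionHelper API: `RetrySketch`, `doWithRetry` and `ContentionException` are hypothetical names, and the back-off values are arbitrary assumptions.

```java
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.Supplier;

// Hypothetical sketch of the retry idea; not Alfresco's implementation.
class RetrySketch {

    // Stand-in for a concurrency failure reported when a unit of work collides.
    static class ContentionException extends RuntimeException {
        ContentionException(String message) {
            super(message);
        }
    }

    // Runs the whole unit of work again from the start when contention is
    // reported, allowing up to maxRetries extra attempts. Other failures propagate.
    static <T> T doWithRetry(Supplier<T> work, int maxRetries) {
        if (maxRetries < 0) {
            throw new IllegalArgumentException("maxRetries must not be negative");
        }
        ContentionException last = null;
        for (int attempt = 0; attempt <= maxRetries; attempt++) {
            try {
                return work.get();
            } catch (ContentionException e) {
                last = e;
                if (attempt < maxRetries) {
                    pause(10L * (attempt + 1)); // wait a little longer each time
                }
            }
        }
        throw last;
    }

    private static void pause(long millis) {
        try {
            Thread.sleep(millis);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting to retry", e);
        }
    }

    public static void main(String[] args) {
        AtomicInteger calls = new AtomicInteger();
        String result = doWithRetry(() -> {
            if (calls.incrementAndGet() < 3) {
                throw new ContentionException("Failed to update node");
            }
            return "committed";
        }, 5);
        System.out.println(result + " after " + calls.get() + " attempts");
    }
}
```

Because the work is re-run as a whole, everything inside the callback has to be safe to repeat, which is one reason to wrap all atomic operations in a single unit.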

Page 11: Foundation APIs and Repository Internals

Transactions: Explicit: Demo: Read-only Batching

[Diagram: Get Stores, Get Children, Lucene Query]

• Time lost to transaction initiation
• 3ms lost per low-level operation ... How much per user click?

Page 12: Foundation APIs and Repository Internals

Transactions: Explicit: Demo: Write Batching

[Diagram: CIFS and FTP: Create Node, Add Content, Get Writer; "index" marked twice]

• Time lost to initiate transactions. Well, yes, but ...
• Unnecessary additional indexing

Page 13: Foundation APIs and Repository Internals

Transactions: Direct Contention: Demo

[Diagram residue: 0 0 1 1 1 1 1 0 0 0; begin]

log4j.logger.org.alfresco.repo.transaction.RetryingTransactionHelper=warn

E.g.: Retrying OptimisticLockingDemo-8: count 0; wait: 0.1s; msg: "Failed to update node 14575"; exception: ...ConcurrencyFailureException

Page 14: Foundation APIs and Repository Internals

Transactions: Alfresco Transaction Support

• RetryingTransactionHelper
  • Reliable, resolves contention
  • “Non-propagating”: the current transaction is suspended and a new one started.
  • TransactionService.getRetryingTransactionHelper -> your instance
• TransactionListener and TransactionListenerAdapter
  • Bind to events associated with a transaction
• AlfrescoTransactionSupport
  • Helper around Spring’s TransactionSynchronizationManager
  • getTransactionReadState: allows logic conditional on the state of the transaction.
  • bindResource and getResource: bind objects to the current transaction. This is like ThreadLocal but safer, i.e. resources are bound to the transaction and go away when the transaction is terminated.
• TransactionalResourceHelper
  • <K,V> getMap(Object resourceKey), etc.: helper to get transactionally-bound collections.
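
The "like ThreadLocal but safer" point can be illustrated with a toy model in which bound resources vanish as soon as the transaction ends. This is a sketch only: `TxnResourcesSketch` is a hypothetical class, its `begin` and `end` methods are invented for the example, and the method names `bindResource` and `getResource` merely echo the slide rather than reproduce AlfrescoTransactionSupport.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical toy model of transaction-bound resources; not Alfresco code.
class TxnResourcesSketch {

    // One resource map per thread, present only while a "transaction" is active.
    private static final ThreadLocal<Map<Object, Object>> CURRENT = new ThreadLocal<>();

    static void begin() {
        if (CURRENT.get() != null) {
            throw new IllegalStateException("a transaction is already active on this thread");
        }
        CURRENT.set(new HashMap<>());
    }

    static void bindResource(Object key, Object value) {
        Map<Object, Object> resources = CURRENT.get();
        if (resources == null) {
            throw new IllegalStateException("no active transaction to bind to");
        }
        resources.put(key, value);
    }

    // Returns the bound value, or null if nothing is bound or no transaction is active.
    static Object getResource(Object key) {
        Map<Object, Object> resources = CURRENT.get();
        return resources == null ? null : resources.get(key);
    }

    // Commit or rollback: either way the bound resources are discarded.
    static void end() {
        CURRENT.remove();
    }

    public static void main(String[] args) {
        begin();
        bindResource("pendingNodes", 42);
        System.out.println("during: " + getResource("pendingNodes"));
        end();
        System.out.println("after: " + getResource("pendingNodes"));
    }
}
```

A bare ThreadLocal would keep its value for the next piece of work scheduled on the same pooled thread; tying the map's lifetime to the transaction avoids that leak.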

Page 15: Foundation APIs and Repository Internals

Navigating Hierarchies

Lucene-Driven APIs
• SearchService.query()
  • Versatile
  • Not always transactionally consistent
  • Cluster: transactions are replayed for indexes

SQL-Driven APIs
• SearchService.selectNodes()
  • Fast for simple path-based searches only, e.g. “/app:company_home/app:data_dictionary”
  • Always consistent!
• NodeService.*
  • Use for consistent views
• FileFolderService.*
  • Limited to cm:folder-based lookups
  • Fast lookups on cm:name (via NodeService)

Page 16: Foundation APIs and Repository Internals

Navigating Hierarchies: Walking Child Associations

• Hierarchy traversal is fast if the child associations can be isolated.
• Put the correct data into createNode

[Diagram: parent P1 and child N2]

Child Association indexes:
• Parent and child
• typeQName
• qname
• Also unique cm:name

• No uniqueness on path QName, but it can be used to trim results to a meaningful few.
• Use type QName for better search selectivity.
• Use child associations that have <duplicate>false</duplicate> to enforce uniqueness (cm:name) and use FileFolderService.

Page 17: Foundation APIs and Repository Internals

Policies and Behaviour

Policy Behaviour Filters
• Temporarily disable policies
• Bound to the current transaction
• Bean: “behaviourFilter”

cm:versionable
• Prevent a change from forcing a new version
• Force versioning on metadata change: cm:autoVersionOnUpdateProps

cm:auditable
• Manually set cm:modified date
• Prevent cm:modified from being recorded
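
The effect of temporarily disabling a behaviour, as in the cm:auditable case, can be modelled in a few lines. Everything here is a stand-in: `BehaviourFilterSketch`, its `Node` class and `updateTitle` are hypothetical, aspects are plain strings, and the real “behaviourFilter” bean is not used.

```java
import java.util.HashSet;
import java.util.Set;

// Hypothetical model of a behaviour filter; not the Alfresco bean.
class BehaviourFilterSketch {

    // Minimal stand-in for a node with one property and an audit timestamp.
    static class Node {
        String title;
        long modified;
    }

    private final Set<String> disabled = new HashSet<>();

    void disableBehaviour(String aspect) {
        disabled.add(aspect);
    }

    void enableBehaviour(String aspect) {
        disabled.remove(aspect);
    }

    boolean isEnabled(String aspect) {
        return !disabled.contains(aspect);
    }

    // A property update that triggers the auditable behaviour only while enabled.
    void updateTitle(Node node, String title, long now) {
        node.title = title;
        if (isEnabled("cm:auditable")) {
            node.modified = now;
        }
    }

    public static void main(String[] args) {
        BehaviourFilterSketch filter = new BehaviourFilterSketch();
        Node node = new Node();
        node.modified = 100L;

        filter.disableBehaviour("cm:auditable");
        try {
            filter.updateTitle(node, "bulk import", 200L);
        } finally {
            // Always switch the behaviour back on, even if the update fails.
            filter.enableBehaviour("cm:auditable");
        }
        System.out.println("modified after filtered update: " + node.modified);
    }
}
```

The try/finally block is a design choice for this sketch, where nothing else resets the filter; the slide notes that the real filter is bound to the current transaction.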

Page 18: Foundation APIs and Repository Internals

Alfresco Repository Internals: Application Bootstrap

Server Startup
• DDL script execution: lock table
• Spring init(): no resources available
• Alfresco Bootstrap: order of AbstractLifecycleBean
• Module Bootstrap: module startup on AbstractModuleComponent

Page 19: Foundation APIs and Repository Internals

Q & A