Version 0.14 Alfresco “Day Zero” Configuration Guide

Upload: javier-alfaya

Post on 13-Apr-2015




Copyright 2010 by Alfresco and others. Information in this document is subject to change without notice. No part of this document may be reproduced or transmitted in any form or by any means, electronic or mechanical, for any purpose, without the express written permission of Alfresco. The trademarks, service marks, logos, or other intellectual property rights of Alfresco and others used in this documentation ("Trademarks") are the property of Alfresco and their respective owners. The furnishing of this document does not give you license to these patents, trademarks, copyrights, or other intellectual property except as expressly provided in any written agreement from Alfresco. The United States export control laws and regulations, including the Export Administration Regulations of the U.S. Department of Commerce, and other applicable laws and regulations apply to this documentation which prohibit the export or re-export of content, products, services, and technology to certain countries and persons. You agree to comply with all export laws, regulations, and restrictions of the United States and any foreign agency or authority and assume sole responsibility for any such unauthorized exportation. You may not use this documentation if you are a competitor of Alfresco, except with Alfresco's prior written consent. In addition, you may not use the documentation for purposes of evaluating its functionality or for any other competitive purposes. This copyright applies to the current version of the licensed program.


Document History

VERSION | DATE | AUTHOR | DESCRIPTION OF CHANGE
0.1 | 2010-02-23 | Peter Monks | First draft released for review by Mark Lugert
0.2 | 2010-02-24 | Peter Monks | Second draft released for review by US support, SE and consulting teams
0.3 | 2010-03-01 | Peter Monks | Added Alfresco support logo; updated port lists; solidified RHEL validation instructions; third draft released for WW review
0.4 | 2010-03-25 | Peter Monks | Added more detailed information on tuning databases; added information on 3rd party app (OpenOffice, ImageMagick, pdf2swf) configuration
0.5 | 2010-04-12 | Peter Monks | Added extra UTF8 checks for MySQL, courtesy of Scott Ashcraft; added version check SQL for MySQL; added SQL validation statements for PostgreSQL, Oracle and MS SQL Server
0.6 | 2010-04-29 | Peter Monks | Migrated to Alfresco Documentation Template; submitted to docs team
0.7 | 2010-05-06 | Helen Mullally | Copy edit and comments. Please use Track Changes to accept or reject changes.
0.8 | 2010-05-06 | Peter Monks | Reviewed changes and accepted / rejected as appropriate.
0.9 | 2010-06-14 | Peter Monks | Added links to Environment Validation tool; removed (now redundant) appendices.
0.10 | 2010-08-17 | Peter Monks | Added note about clustering and db.pool.max; added section on hibernate.jdbc.fetch_size
0.11 | 2010-08-25 | Peter Monks | Updated virtual file servers thread pool configuration for v3.2+
0.12 | 2010-11-22 | Peter Monks | Added note on JIT compiler exclusions; added recommendation for db.pool.idle; added notes on DB2; added recommendation for in-transaction indexing; added recommendation for quota calculation
0.13 | 2010-11-23 | Peter Monks | Added recommendation for db.pool.validate.query
0.14 | 2010-12-16 | Peter Monks | Added recommendation for JVM stack size; added recommendation for index.recovery.mode


Table of Contents

Introduction
    Document purpose
    Intended audience
    Glossary
Architecture validation
    Supported stacks for Alfresco
    Hardware
        I/O
        CPU
    Database
        Maintenance and tuning
    Operating System
    Java Virtual Machine
Environment validation
Day zero configuration
    JVM tuning
        Increase JVM heap
        Reduce JVM stack
        Remove JIT exclusions
    Set dir.root to absolute path
    Enable automatic search index recovery
    Database connection pool
        Maximum Size
        Idle Size
        Validation Query
    Database fetch size
    In-transaction full text indexing (optional)
    Quota calculations (optional)
    Application server worker thread pool (optional)
    Virtual File Server (VFS) worker thread pool (optional)
    SharePoint Protocol worker thread pool (optional)
    JODConverter-based OpenOffice integration
    Configure other third party applications


Introduction

Document purpose

By default, Alfresco's configuration is optimized for single user evaluation of Alfresco. This configuration minimizes resource usage at the expense of scalability (particularly scalability in the presence of large concurrent traffic volumes). Therefore, for any other use of Alfresco (including but not limited to: QA, performance / scalability testing, production, production mirror, disaster recovery), Alfresco strongly recommends that additional configuration be performed.

This document describes the universal configuration steps that should be taken to achieve this, regardless of the specific Alfresco use case, and before Alfresco is started for the first time. It does not, however, describe the full breadth of Alfresco configuration options that can be leveraged to scale Alfresco in use case specific ways; these are described in detail elsewhere (for example, in the product documentation and knowledge base).

This document is currently focused on Alfresco 3.3 installations, although many of the recommendations can be applied to earlier versions as well (provided the associated Supported Stack is used, rather than the 3.3 Supported Stack).

Intended audience

This document is intended for developers, system administrators, and anyone who is tasked with installing an Alfresco instance, regardless of the intended use of that instance (evaluation, development, test / QA, production).

Glossary

The following table describes the terms that are used within this document, each of which has a specific meaning within the context of Alfresco:

TERM | DEFINITION
DBA | DataBase Administrator: someone who has been trained and certified to administer a specific relational database product. Note: relational databases vary greatly in their capabilities, so it is critical that any DBA be experienced with exactly the database product you are intending to use for Alfresco.
I/O | Input/Output: in this document, refers to I/O performed by Alfresco to some external software or device (such as the network or a disk subsystem).
OS | Operating System
RAM | Random Access Memory
CPU | Central Processing Unit
VFS | Virtual File Server: specifically the functionality in Alfresco that provides access to the repository via CIFS, FTP, NFS and WebDAV
SMTP | Simple Mail Transfer Protocol: a widely used protocol for sending email
IMAP | Internet Message Access Protocol: a more modern protocol used for interacting with email servers
JVM | Java Virtual Machine


Architecture validation

This section describes the steps required to validate the architecture to ensure that it meets the prerequisites for an Alfresco installation. The following summary shows the steps that are required to validate the architecture:

1. Check the supported stacks list.
2. Optimize the hardware settings.
3. Validate the database.
4. Validate the Operating System.
5. Validate and tune the JVM.

Supported stacks for Alfresco

Validate that your environment is on the Supported Stacks for Alfresco list at one of the following locations:

http://www.alfresco.com/services/support/stacks/ (summary matrix)

https://network.alfresco.com/?f=default&o=workspace://SpacesStore/4defa351-68cb-4491-9f23-46fb861ddd05 (comprehensive matrix - requires a subscription to the Alfresco Enterprise Network)

Hardware

This section describes how to validate your I/O subsystems and CPU.

I/O

One of the primary determinants of Alfresco's performance is I/O. Optimize the following, in priority order:

1. I/O to the relational database Alfresco is configured to use.
2. I/O to the disk subsystem on which the Lucene indexes are stored.
3. I/O to the disk subsystem on which the content is stored.

In each case, the goal is to minimize the latency (response time) between Alfresco and the storage system, while also maximizing bandwidth. Low latency is particularly important for database I/O, and one rudimentary test of this is to ping the database server from the Alfresco server: round trip times greater than 1ms indicate a sub-optimal network topology or configuration that will adversely impact Alfresco performance. “Jitter” (highly variable round trip times) is also of concern, as that will increase the variability of Alfresco's performance; the standard deviation for round trip times should be less than 0.1ms.
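The rudimentary ping test described above can be scripted. The following is a minimal sketch, assuming a Linux host whose ping prints an iputils-style summary line ("rtt min/avg/max/mdev = ... ms"). The function name check_db_latency is hypothetical, and ping's "mdev" figure is used here only as an approximation of the standard deviation.

```shell
#!/bin/sh
# Reads ping output on stdin and checks the summary line against the
# thresholds from the text: average round trip time of at most 1 ms and
# a deviation below 0.1 ms.
# Assumption: an iputils-style line "rtt min/avg/max/mdev = a/b/c/d ms".
check_db_latency() {
    awk '
        /min\/avg\/max\/mdev/ {
            split($0, parts, "= ")          # parts[2] = "a/b/c/d ms"
            split(parts[2], v, "[/ ]")      # v[2] = avg, v[4] = mdev
            avg = v[2]; mdev = v[4]; found = 1
        }
        END {
            if (!found) { print "no rtt summary line found"; exit 2 }
            if (avg + 0 > 1.0 || mdev + 0 >= 0.1) {
                printf "FAIL avg=%s ms mdev=%s ms\n", avg, mdev
                exit 1
            }
            printf "OK avg=%s ms mdev=%s ms\n", avg, mdev
        }'
}
```

A possible invocation, with db.example.com standing in for your database host, is: ping -c 20 db.example.com | check_db_latency. Other ping implementations format their summary differently, so the pattern may need adjusting.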


CPU

Alfresco will function correctly on virtually all modern 32bit and 64bit CPUs. For production use, however, Alfresco recommends a clock speed greater than 2.5GHz to ensure reasonable response times for the end user. Although it is not strictly necessary, a 64bit architecture is also recommended, primarily because it allows the JVM to utilize more memory (RAM) than a 32bit architecture.

Note: CPU clock speed is of particular concern for the Sun UltraSPARC architecture, as some current UltraSPARC based servers ship with CPUs that have clock speeds as low as 900MHz, well below what is required for adequate Alfresco performance. If you intend to use Sun servers for hosting Alfresco, please ensure that all CPUs have a clock speed of at least 2.5GHz. At the time of writing, this implies that:

• an X or M class Sun server is required, with careful CPU selection to ensure 2.5GHz (or better) clock speed

• T class servers should not be used, as they do not support CPUs faster than approximately 2GHz

Alfresco is unable to provide specific guidance on Sun server classes, models or configurations, so you should talk with your Sun reseller to confirm that minimum CPU clock speed recommendations will be met.

Database

Disclaimer: Alfresco is unable to provide specialized support for maintaining or tuning your relational database. You MUST have an experienced, certified DBA on staff to support your Alfresco installation(s) [1].

Maintenance and tuning

As with any application that uses a relational database, regular maintenance and tuning of the Alfresco database and schema is a necessity. Specifically, all of the database servers that Alfresco supports require at the very least that some form of index statistics maintenance be performed at frequent, regular [2] intervals [3].

Important note: index maintenance on most databases is an expensive, and in some cases blocking, operation that can severely impact Alfresco performance while in progress. Please consult with your experienced, certified DBA regarding best practices for scheduling these operations in your database.

Some examples of these commands for specific databases are listed below (note: you must validate the precise commands required in your environment with your DBA; this list is for illustrative purposes only):

DATABASE | EXAMPLE MAINTENANCE COMMANDS
MySQL | ANALYZE [4]; consult with an experienced, certified MySQL DBA who has InnoDB experience (Alfresco cannot use a MyISAM database and hence an InnoDB-experienced MySQL DBA is required)
PostgreSQL | VACUUM and ANALYZE [5]; consult with an experienced, certified PostgreSQL DBA
Oracle | Depends on version [6]; consult with an experienced, certified Oracle DBA
Microsoft SQL Server | ALTER INDEX REBUILD [7], UPDATE STATISTICS [8]; consult with an experienced, certified MS SQL Server DBA
DB2 | REORGCHK [9], RUNSTATS [10]; consult with an experienced, certified DB2 LUW DBA

[1] Typically this will not be a full time role once the database is configured and tuned and automated maintenance processes are in place. However, an experienced, certified DBA is required to get to this point.
[2] Unless your DBA recommends otherwise, Alfresco suggests performing this maintenance daily.
[3] Note: Relying on your database's automated statistics gathering mechanism may not be optimal; consult an experienced, certified DBA for your database to confirm this.
[4] http://dev.mysql.com/doc/refman/5.1/en/analyze-table.html
[5] http://www.postgresql.org/docs/8.4/static/maintenance.html
[6] http://download.oracle.com/docs/cd/B19306_01/server.102/b14211/stats.htm#PFGRF003
[7] http://technet.microsoft.com/en-us/library/ms188388.aspx
[8] http://technet.microsoft.com/en-us/library/ms187348.aspx
[9] http://publib.boulder.ibm.com/infocenter/db2luw/v9r7/index.jsp?topic=/com.ibm.db2.luw.admin.cmd.doc/doc/r0001971.html
[10] http://publib.boulder.ibm.com/infocenter/db2luw/v9r7/index.jsp?topic=/com.ibm.db2.luw.admin.cmd.doc/doc/r0001980.html

Operating System

You should ensure that your chosen OS has been officially certified for use with Alfresco (refer to the Supported Stacks list for details). Alfresco is not sensitive to changes to the OS configuration, beyond the impact on I/O performance (see the I/O section above). That said, it is recommended that a 64bit OS be used if the hardware (CPU, and so on) is 64bit capable.

Java Virtual Machine

You should ensure that your chosen JDK-enabled Java Virtual Machine has been officially certified for use with Alfresco (refer to the Supported Stacks list for details).

For information on configuring and tuning the JVM, refer to the product documentation or the following wiki page:

http://wiki.alfresco.com/wiki/JVM_Tuning

Note that Alfresco requires an official Sun 1.6 JDK (or IBM JDK, if using WebSphere); other JVMs (earlier versions, Harmony, gcj, JRockit, HP, and so on) are NOT supported and are known to cause issues in various parts of the product. Alfresco recommends using a 64bit JVM if the underlying platform (OS and hardware) is 64bit capable.


Environment validation

The following environment-specific items must be validated prior to installing Alfresco. Note that Alfresco now provides an Environment Validation tool that can validate most of the following requirements. This tool is available at:

https://network.alfresco.com/?f=default&o=workspace://SpacesStore/f98ad411-510d-444f-8166-432a66fe172a

1. Validate that the hostname of the server is resolvable in DNS. [11]
2. Validate that the user Alfresco will run as can open sufficient file descriptors (4096 or more).
3. Validate that the ports on which Alfresco listens are available [12]:
   o FTP: TCP 21 [13]
   o SMTP: TCP 25 [14]
   o SMB / NetBT: UDP 137, 138; TCP 139, 445 [15]
   o IMAP: TCP 143 [16]
   o SharePoint Protocol: TCP 7070 [17]
   o Tomcat Administration: TCP 8005
   o HTTP: TCP 8080
   o RMI: TCP 50500
4. Validate that the installed JVM is Sun version 1.6.
5. Validate that the directory in which the JVM is installed does not contain spaces.
6. Validate that the directory in which Alfresco is installed does not contain spaces.
7. Validate that the directory Alfresco will use for the repository (typically called “alf_data”) is both readable and writeable by the OS user that the Alfresco process will run as.
8. Validate that you can connect to the database as the Alfresco database user, from the Alfresco server. [18]
9. Validate that the character encoding for the Alfresco database is UTF-8.
10. (MySQL only) Validate that the default storage engine for the database server that Alfresco will use is InnoDB [19].
11. Validate that the following third-party software is installed, at the correct versions:
   o OpenOffice v3.1 or newer
   o ImageMagick v6.2 or newer
12. (RHEL and Solaris only) Validate that OpenOffice is able to run in headless mode.

Refer to the appendices in this document for OS and database-specific commands that can be used to perform these validations.

[11] This is required if Alfresco is going to be configured in a cluster.
[12] Note: the ports listed here are the defaults. If you're planning on reconfiguring Alfresco to use different ports, or are enabling additional protocols (such as HTTPS, SMTP, IMAP or NFS) you should update this list with those port numbers.
[13] On Unix-like OSes that offer so-called “privileged ports”, Alfresco will normally be unable to bind to this port unless run as the root user (which is not recommended). In this case, even if this port is available, Alfresco will still fail to bind to it. For FTP services, however, this is a non-fatal error: Alfresco's FTP functionality will simply be disabled in the repository.
[14] SMTP is not enabled by default.
[15] On Unix-like OSes that offer so-called “privileged ports”, Alfresco will normally be unable to bind to these ports unless run as the root user (which is not recommended). In this case, even if the ports are available, Alfresco will still fail to bind to them. For CIFS services, however, this is a non-fatal error: Alfresco's CIFS functionality will simply be disabled in the repository.
[16] IMAP is not enabled by default.
[17] Some of the Alfresco bundles (specifically the WAR, EAR and Tomcat bundles) don't ship with the SharePoint Protocol enabled by default. If you're using one of these bundles you can ignore this port until/unless you install support for the SharePoint Protocol.
[18] Note: this will require installation of the database vendor's “client tools” on the Alfresco server.
[19] Not required as of Alfresco 3.3.
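Some of the checks above (items 2, 5, 6 and 7) are simple enough to script by hand. The following is a minimal sketch only, not the Environment Validation tool; the function names are hypothetical, and the checks should be run as the OS user that the Alfresco process will run as.

```shell
#!/bin/sh
# Item 2: succeeds when the given open-file limit is at least 4096.
# Pass in the output of "ulimit -n", which may be the word "unlimited".
fd_limit_ok() {
    limit=$1
    [ "$limit" = "unlimited" ] || [ "$limit" -ge 4096 ]
}

# Items 5 and 6: succeeds when the path contains no space characters.
path_has_no_spaces() {
    case "$1" in
        *" "*) return 1 ;;
        *)     return 0 ;;
    esac
}

# Item 7: succeeds when the directory exists and is both readable and
# writable by the current user.
dir_is_read_write() {
    [ -d "$1" ] && [ -r "$1" ] && [ -w "$1" ]
}
```

For example: fd_limit_ok "$(ulimit -n)", path_has_no_spaces "$JAVA_HOME", and dir_is_read_write followed by the location of your alf_data directory. Each function reports its result through its exit status.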


Day zero configuration

This section describes the configuration changes that will improve Alfresco's reliability, stability and performance when used for anything other than single user evaluation purposes.

JVM tuning

Note: the following recommendations are the bare minimum reconfiguration required by Alfresco, but further tuning of the JVM may be necessary depending on your use of Alfresco. Refer to the product documentation or the following wiki page:

http://wiki.alfresco.com/wiki/JVM_Tuning

With the exception of the settings listed here, it is not recommended to set any JVM parameter without first analyzing the running JVM and experimentally verifying that the change definitively improves the behavior of Alfresco for your use case.

JVM tuning is a highly environment and use case specific activity, and it is trivially easy to destroy the JVM's inherent reliability and scalability with uninformed changes to the JVM settings.

Increase JVM heap

The startup script for Alfresco [20] is optimized for single user evaluations of Alfresco on developer class desktop PCs, and therefore configures a very small heap for the JVM. For any other usage (including any kind of multi-user evaluation, testing, or production configuration), Alfresco requires that the JVM heap size be increased to at least 1GB (32bit JVM) or 2GB (64bit JVM):

export JAVA_OPTS="-XX:MaxPermSize=512m -Xms256m -Xmx1024m"

The example above shows the 32bit minimum; on a 64bit JVM, use -Xmx2048m or more.

Reduce JVM stack

The default per-thread stack in the JVM varies, depending on the OS, JVM and JVM version in use, but in general is larger than Alfresco requires (up to 8MB in some cases). Each thread's stack consumes memory in the JVM process (on the Sun JVM it is allocated outside the Java heap, but competes with the heap for the same process address space), and Alfresco can potentially have several hundred threads active at once. It is therefore recommended to reduce the stack size, freeing up memory for other purposes:

export JAVA_OPTS="-XX:MaxPermSize=512m -Xms256m -Xmx1024m -Xss128k"

If, as a result of making this change, you start seeing java.lang.StackOverflowError exceptions in the Alfresco log, you may increase this value in 128k increments until the exceptions disappear.
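To make the potential saving concrete, the following sketch estimates the total memory reserved for thread stacks as thread count multiplied by per-thread stack size. The function name stack_total_mb and the figure of 300 threads are illustrative assumptions, not Alfresco defaults; 8192 KB corresponds to the 8MB upper bound mentioned above. The result is an upper-bound estimate, since the OS may not commit the whole reservation.

```shell
#!/bin/sh
# Estimated total stack reservation in whole MB (rounded down):
#   threads * per-thread stack size in KB / 1024
stack_total_mb() {
    threads=$1
    stack_kb=$2
    echo $(( threads * stack_kb / 1024 ))
}

stack_total_mb 300 8192   # 8MB stacks: prints 2400
stack_total_mb 300 128    # with -Xss128k: prints 37
```

Under these assumptions, reducing the stack size shrinks the reservation from roughly 2.4GB to under 40MB.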

[20] ${ALFRESCO_HOME}/alfresco.sh or %ALFRESCO_HOME%\alfresco.bat in versions up to and including 3.3SP2; ${ALFRESCO_HOME}/tomcat/scripts/ctl.sh or %ALFRESCO_HOME%\tomcat\scripts\ctl.bat in versions 3.3SP3 and above that use Tomcat.

Remove JIT exclusions

Early builds of the 1.5 JDK had JIT compiler bugs that impacted Alfresco (specifically around the Lucene search engine). By default, the Alfresco start up script disables JIT compilation for these classes, something that is no longer relevant now that Alfresco only supports JDK 1.6 and above. Double check that these JIT exclusions are commented out in the startup script, as follows (note the leading comment symbol):

# Following only needed for Sun JVMs before to 1.5 update 8
#export JAVA_OPTS="${JAVA_OPTS} -XX:CompileCommand=exclude,org/apache/lucene/index/IndexReader\$1,doBody -XX:CompileCommand=exclude,org/alfresco/repo/search/impl/lucene/index/IndexInfo\$Merger,mergeIndexes -XX:CompileCommand=exclude,org/alfresco/repo/search/impl/lucene/index/IndexInfo\$Merger,mergeDeletions"

On Windows, the “rem” command should be used in place of the Unix-shell “#” comment symbol.

Note: newer versions of Alfresco (3.3+) no longer include this option in the start up script, so don't be surprised if it is not present.

Set dir.root to absolute path

The dir.root property controls which directories on disk Alfresco uses to store content and the Lucene indexes. Alfresco is unaware of the specific file system layout on your server, therefore the default configuration sets the dir.root property to the following relative path:

./alf_data

It is strongly recommended that you always set this value to an absolute file system path before starting Alfresco for the first time. This ensures that no matter how the Alfresco instance is started, it will always find the directories where content has previously been written. With Tomcat, this property is found in:

${ALFRESCO_HOME}/tomcat/shared/classes/alfresco-global.properties [21]

If you do not set dir.root to an absolute path, you may see a “CONTENT INTEGRITY ERROR” message in the alfresco.log file during a second or subsequent startup of the server.

Other than requiring an absolute path, Alfresco has no specific requirements for where this directory resides or what it is called. You should optimize the location of the file system portion of the Alfresco repository to maximize I/O performance (as mentioned in the I/O section).

[21] As of Alfresco Enterprise 3.2.0; in earlier versions this property is found in ${ALFRESCO_HOME}/tomcat/shared/classes/alfresco/extension/custom-repository.properties
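A quick way to confirm the setting before first startup is to read dir.root back out of the properties file. The sketch below is an assumption-laden illustration: the function name dir_root_is_absolute is hypothetical, it expects the property written as dir.root=value with no whitespace around the equals sign, and it recognizes only Unix-style absolute paths (those beginning with "/").

```shell
#!/bin/sh
# Prints the dir.root value found in the given properties file and
# succeeds only if that value is an absolute Unix path.
# If the property appears more than once, the last occurrence is used.
dir_root_is_absolute() {
    props_file=$1
    value=$(sed -n 's/^dir\.root=//p' "$props_file" | tail -n 1)
    echo "$value"
    case "$value" in
        /*) return 0 ;;
        *)  return 1 ;;
    esac
}
```

Called with the path to alfresco-global.properties, the function fails for the default ./alf_data and for a missing property, and succeeds for a value such as /srv/alfresco/alf_data (an example location only).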


Enable automatic search index recovery

By default Alfresco validates that the search indexes are valid and complete during startup, but does not attempt to recover them if problems are detected. To ensure missing indexes are recovered, find the index.recovery.mode property in:

${ALFRESCO_HOME}/tomcat/shared/classes/alfresco-global.properties

uncomment it, and set it to the value AUTO:

#
# Index Recovery Mode
#-------------
index.recovery.mode=AUTO

If the index.recovery.mode property is not visible in alfresco-global.properties, it can be added anywhere in the file.

Database connection pool

Maximum Size

By default, each Alfresco instance is configured to use up to a maximum of 40 [22] database connections. All operations in Alfresco require a database connection, which places a hard upper limit on the number of concurrent requests a single Alfresco instance can service (that is, 40), from all protocols, by default.

Most Java application servers have higher default settings for concurrent access [23], and this, coupled with other threads in Alfresco (non-HTTP protocol threads, background jobs, and so on), can quickly result in excessive contention for database connections within Alfresco, manifesting as poor performance for users.

If you are using Alfresco in anything other than a single user evaluation mode, increase the maximum size of the database connection pool to at least [number of application server worker threads] + 75. For a Tomcat 6 default HTTP worker thread configuration (200 threads), and with all other Alfresco thread pools left at the defaults, this means this property should be set to at least 275.

To increase the database connection pool, add the db.pool.max property to:

${ALFRESCO_HOME}/tomcat/shared/classes/alfresco-global.properties

and set it to the recommended value:

db.pool.max=275

You may add this property anywhere in the file, although for clarity you should place it immediately after the other database properties.

Important note: after increasing the size of the Alfresco database connection pools, you must also increase the number of concurrent connections your database can handle to at least the size of the cumulative Alfresco connection pools [24]. In fact, Alfresco recommends configuring at least 10 more connections to the database than are configured cumulatively across all of the Alfresco connection pools, to ensure that you can still connect to the database even if Alfresco saturates its own connection pools. Do not forget to factor in cluster nodes (which can each use up to 275 database connections) as well as connections required by other applications that are using the same database server as Alfresco.

The precise mechanism for reconfiguring your database's connection limit depends on the relational database product you are using; your DBA should be able to readily configure this.

[22] As of Alfresco Enterprise 3.2r; this number may change in future versions.
[23] Tomcat 6.0, for example, allows up to 200 concurrent HTTP requests by default.
[24] In a cluster each node has its own independent database connection pool. You must configure sufficient database connections for all of the Alfresco cluster nodes to be able to connect simultaneously.
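The sizing rules above reduce to simple arithmetic, sketched below. The function names pool_max and db_connection_limit are hypothetical, and the two-node cluster with no other applications sharing the database is an assumed example rather than a recommendation.

```shell
#!/bin/sh
# db.pool.max for one node: application server worker threads + 75.
pool_max() {
    worker_threads=$1
    echo $(( worker_threads + 75 ))
}

# Minimum database connection limit: every node's pool, plus connections
# used by other applications on the same database server, plus the 10
# spare connections recommended in the text.
db_connection_limit() {
    nodes=$1
    per_node_pool=$2
    other_apps=$3
    echo $(( nodes * per_node_pool + other_apps + 10 ))
}

pool=$(pool_max 200)    # Tomcat 6 default of 200 HTTP worker threads
echo "db.pool.max=$pool"
echo "database limit >= $(db_connection_limit 2 "$pool" 0)"
```

With these inputs the script prints db.pool.max=275 and a database limit of at least 560 connections.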

Idle Size

By default, each Alfresco instance will, when idle, reduce the size of the database connection pool to no more than 8 open connections at any time, in order to minimize resource usage in both the JVM and the database. While appropriate for evaluation and individual developer environments, this setting is not appropriate for any kind of multi-user or high traffic installation, including but not limited to QA, performance / scalability test, production mirror and production environments. For these environments Alfresco recommends disabling the idle connection reclamation logic in the database connection pool, by adding the db.pool.idle property to:

${ALFRESCO_HOME}/tomcat/shared/classes/alfresco-global.properties

and setting it to the “disabled” value of -1:

db.pool.idle=-1

Validation Query By default Alfresco does not periodically validate each database connection retrieved from the database connection pool. Validating connections is, however, very important for long running Alfresco servers, since there are various ways database connections can unexpectedly be closed (for example by transient network glitches and database server timeouts). Enabling periodic validation of database connections involves adding the db.pool.validate.query property to:

${ALFRESCO_HOME}/tomcat/shared/classes/alfresco-global.properties

and setting it to one of the following values, depending on the database thatʼs in use:

DATABASE               VALUE FOR db.pool.validate.query
MySQL                  /* PING */
PostgreSQL             SELECT VERSION()
Oracle                 SELECT 1 FROM DUAL
Microsoft SQL Server   SELECT 1
DB2                    SELECT CURRENT DATE FROM SYSIBM.SYSDUMMY1
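The lookup described by the table can be sketched in Python as follows. The query strings are copied from the table; the dictionary keys and the function name are labels invented for this sketch and are not Alfresco identifiers.

```python
# Validation queries as listed in the table above. The lowercase keys are
# this sketch's own labels, not values that Alfresco itself recognizes.
VALIDATION_QUERIES = {
    "mysql": "/* PING */",
    "postgresql": "SELECT VERSION()",
    "oracle": "SELECT 1 FROM DUAL",
    "sqlserver": "SELECT 1",
    "db2": "SELECT CURRENT DATE FROM SYSIBM.SYSDUMMY1",
}


def validate_query_property(database):
    """Return the alfresco-global.properties line for the given database."""
    key = database.strip().lower()
    if key not in VALIDATION_QUERIES:
        raise ValueError("no validation query listed for %r" % database)
    return "db.pool.validate.query=" + VALIDATION_QUERIES[key]


if __name__ == "__main__":
    print(validate_query_property("Oracle"))
```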

Database fetch size

Some databases [25] return data in batches of rows, each of which requires a separate network round trip between the Alfresco server and the database. For JDBC (the technology Alfresco uses to connect to the relational database), the size of these batches is controlled via a parameter called the “fetch size”.

Alfresco uses the default fetch size defined by the configured JDBC driver, which can be as low as 10 rows (the Oracle driver's default, for example). With such a driver, any time Alfresco issues a query that returns more than 10 rows, the database will break up that result set into batches of 10 rows each, potentially incurring a large number of network round trips for a single SQL result set. In this case even very small (sub-millisecond) latencies between the Alfresco and database servers can be massively multiplied, resulting in poor performance but no obvious symptoms (CPU usage will be minimal for both Alfresco and the database, the network will not be saturated, the SQL queries themselves may be executed efficiently by the database, etc.). For this reason Alfresco recommends changing the JDBC fetch size to a larger number. To do this, add the hibernate.jdbc.fetch_size property to:

${ALFRESCO_HOME}/tomcat/shared/classes/alfresco-global.properties

and set it to a large value (such as 150):

hibernate.jdbc.fetch_size=150

You may add this property anywhere in the file, although for clarity you should place it immediately after the other database properties.
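To make the round-trip multiplication concrete, here is a minimal Python sketch of the arithmetic. It assumes a deliberately simplified model in which every batch costs exactly one network round trip and nothing else; the 10,000-row result set and the 0.5 ms latency are hypothetical example figures, while the fetch sizes of 10 and 150 are the ones discussed above.

```python
def round_trips(rows, fetch_size):
    """Number of batches (and so network round trips) needed to fetch `rows` rows."""
    if rows < 0 or fetch_size < 1:
        raise ValueError("rows must be >= 0 and fetch_size must be >= 1")
    return -(-rows // fetch_size)  # integer ceiling division


def latency_overhead_ms(rows, fetch_size, latency_ms):
    """Total time spent on network latency alone, in milliseconds."""
    return round_trips(rows, fetch_size) * latency_ms


if __name__ == "__main__":
    # Hypothetical query returning 10,000 rows over a link with 0.5 ms latency.
    for fetch_size in (10, 150):
        trips = round_trips(10000, fetch_size)
        overhead = latency_overhead_ms(10000, fetch_size, 0.5)
        print("fetch size %3d: %4d round trips, %.1f ms of latency"
              % (fetch_size, trips, overhead))
```

Under these assumptions a fetch size of 10 needs 1000 round trips (500 ms of pure latency), whereas a fetch size of 150 needs 67 (33.5 ms).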

In-transaction full text indexing (optional)

By default, Alfresco will perform full text indexing atomically within each repository transaction, provided that producing full text from the content takes less than 20ms. In many cases, it is appropriate to force all full text extraction and indexing to occur asynchronously, in a separate background transaction, resulting in savings of up to 20ms for each user-initiated transaction. To accomplish this, add the lucene.maxAtomicTransformationTime property to:

${ALFRESCO_HOME}/tomcat/shared/classes/alfresco-global.properties

and set it to 0:

lucene.maxAtomicTransformationTime=0

Important note: this setting increases the chances of short-term staleness in the full text indexes. This possibility exists in any case (any full text extraction that takes 20ms or more will, by default, occur asynchronously), but this setting makes it more likely.

[25] Notably Oracle.


Quota calculations (optional)

The logic that calculates quotas and detects violations executes in-transaction (i.e. it affects every single write to the repository), is quite expensive (it requires recursive calculation of document / space sizes) and incurs some overhead even when a particular user doesn't have a content quota defined. If there is no requirement to use content quotas, this overhead can be eliminated by globally disabling quota calculations in the repository. To accomplish this, add the system.usages.enabled property to:

${ALFRESCO_HOME}/tomcat/shared/classes/alfresco-global.properties

and set it to false:

system.usages.enabled=false

Important note: this setting globally disables quota calculations, so the functionality is completely unavailable in this installation of Alfresco. For that reason this setting should not be used if there is any requirement to use content quotas in this Alfresco instance. It can, however, be turned back on at a later date with no side effects (beyond the expected impact on Alfresco performance).

Application server worker thread pool (optional)

Most Java application servers have a worker thread pool that is sized significantly larger than the default Alfresco database connection pool, and this configuration can lead to excessive blocking and even deadlocks within Alfresco.

For low or moderate concurrency Alfresco instances, an alternative to increasing the size of the database connection pool is to reduce the number of application server worker threads. How this is done varies between application servers, but for the Tomcat 6.0 bundled with Alfresco, this is configured in [26]:

${ALFRESCO_HOME}/tomcat/conf/server.xml
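In Tomcat, the size of each connector's worker thread pool is set by the maxThreads attribute of the Connector elements in server.xml, and Tomcat's documented default is 200 when the attribute is omitted. The Python sketch below reports that value per connector so it can be compared with the database connection pool size. The sample XML is a hypothetical, heavily trimmed server.xml, and the sketch ignores connectors that delegate to a shared Executor element.

```python
import xml.etree.ElementTree as ET

# Tomcat's documented default when a Connector has no maxThreads attribute.
TOMCAT_DEFAULT_MAX_THREADS = 200


def connector_max_threads(server_xml_text):
    """Map each Connector's port to its effective maxThreads value."""
    root = ET.fromstring(server_xml_text)
    result = {}
    for connector in root.iter("Connector"):
        port = connector.get("port", "?")
        result[port] = int(connector.get("maxThreads", TOMCAT_DEFAULT_MAX_THREADS))
    return result


# Hypothetical, trimmed-down server.xml used only for illustration.
SAMPLE_SERVER_XML = """<Server port="8005" shutdown="SHUTDOWN">
  <Service name="Catalina">
    <Connector port="8080" protocol="HTTP/1.1" maxThreads="150" />
    <Connector port="8009" protocol="AJP/1.3" />
  </Service>
</Server>"""

if __name__ == "__main__":
    for port, threads in sorted(connector_max_threads(SAMPLE_SERVER_XML).items()):
        print("connector on port %s: maxThreads=%d" % (port, threads))
```

To inspect a real installation, read the contents of server.xml into a string and pass it to connector_max_threads.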

Virtual File Server (VFS) worker thread pool (optional)

For low or moderate concurrency Alfresco instances, you might also consider reducing the number of Alfresco VFS worker threads (these threads provide CIFS, FTP, and NFS services for a repository). By default, Alfresco will start 25 VFS worker threads, and this pool may grow up to 50 worker threads in total [27]. To tune either or both of these properties, copy the following configuration file:

${ALFRESCO_HOME}/tomcat/webapps/alfresco/WEB-INF/classes/alfresco/subsystems/fileServers/default/file-servers-context.xml

to:

${ALFRESCO_HOME}/tomcat/shared/classes/alfresco/extension/custom-file-servers-context.xml

Remove all of the <bean> definitions except for the bean with the id “fileServerConfiguration”. Add the following property to the “fileServerConfiguration” bean:

<property name="coreServerConfigBean" ref="coreServerConfig" />

Add the following bean definition after the “fileServerConfiguration” bean, adjusting the threadPoolInit and threadPoolMax values (shown here at their defaults of 25 and 50) as required:

<bean id="coreServerConfig" class="org.alfresco.filesys.config.CoreServerConfigBean">
    <property name="threadPoolInit" value="25"/>
    <property name="threadPoolMax" value="50"/>
</bean>

[26] More information is available at http://tomcat.apache.org/tomcat-6.0-doc/config/index.html.
[27] More information is available at http://wiki.alfresco.com/wiki/File_Server_Configuration#Advanced_Server_Configuration

SharePoint Protocol worker thread pool (optional) ###TODO!!!!

JODConverter-based OpenOffice integration

As of Alfresco v3.2, Alfresco includes an integration with OpenOffice based on the JODConverter framework [28], although by default the original non-JODConverter integration is configured instead. It is highly recommended to reconfigure Alfresco to use the new JODConverter-based integration in place of the original integration. The properties are found in:

${ALFRESCO_HOME}/tomcat/shared/classes/alfresco-global.properties

In that file, set the following properties to these values [29]:

ooo.exe=soffice
jodconverter.officeHome=[path to OpenOffice install directory]
jodconverter.portNumbers=8101
ooo.enabled=false
jodconverter.enabled=true

Configure other third party applications

Unless they are in the system PATH, Alfresco needs to be configured with the absolute locations of the ImageMagick and pdf2swf tools that are used by Alfresco for handling various file formats. The properties are found in:

${ALFRESCO_HOME}/tomcat/shared/classes/alfresco-global.properties

In that file, set the following properties to these values [30], [31]:

img.root=[path to ImageMagick install directory]
swf.exe=[path to pdf2swf executable file]

Please take careful note that the first property points to the directory in which ImageMagick is installed, whereas the second property points to the pdf2swf executable file.

[28] http://www.artofsolving.com/opensource/jodconverter
[29] Refer to the product documentation.
[30] http://wiki.alfresco.com/wiki/ImageMagick_Configuration
[31] http://wiki.alfresco.com/wiki/Installing_Alfresco_components#Installing_SWFTools