Post on 23-Jan-2017
Improving Performance and Flexibility of Content Listings Using Criteria API
Nils Breunese
Public Broadcaster since 1926 The Netherlands
Online since 1994 Open-source CMS released in 1997
Using Magnolia since 2010 Still migrating websites
Tens of thousands of pages Multiple sites like that
Overview pages Lots of them
Thanks for the warning… Even 10 seconds would be way too long
WARN info.magnolia.module.cache.filter.CacheFilter -- The following URL took longer than 10 seconds (63969 ms) to render. This might cause timeout exceptions on other requests to the same URI.
Overview models Standard Templating Kit
Tracking back from the template newsOverview.ftl
    (...)
    [#assign pager = model.pager]
    [#assign newsList = cmsfn.asContentMapList(pager.pageItems)!]
    (...)
Constructing the pager (AbstractItemListModel)
    public STKPager getPager() throws RepositoryException {
        (...)
        return new STKPager(currentPageLink, getItems(), content);
    }
Four step pipeline (AbstractItemListModel)
    public Collection<Node> getItems() throws RepositoryException {
        List<Node> itemsList = search();       // step 1: search
        this.filter(itemsList);                // step 2: filter
        this.sort(itemsList);                  // step 3: sort
        itemsList = this.shrink(itemsList);    // step 4: shrink
        return itemsList;
    }
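To make the cost of this pipeline concrete, here is a minimal, framework-free sketch of the same four steps. The `Page` class, the generated paths and the method names are hypothetical stand-ins, not Magnolia or JCR types; the point is only that every matching item is built before a single page is cut from the list.

```java
import java.time.LocalDate;
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

class InMemoryPipelineSketch {

    // Hypothetical stand-in for a JCR page node.
    static final class Page {
        final String path;
        final LocalDate date;

        Page(String path, LocalDate date) {
            this.path = path;
            this.date = date;
        }
    }

    // Step 1: "search" materializes every matching page object.
    static List<Page> search(int total) {
        List<Page> pages = new ArrayList<>();
        LocalDate start = LocalDate.of(2016, 1, 1);
        for (int i = 0; i < total; i++) {
            pages.add(new Page("/siteX/news/item" + i, start.plusDays(i % 365)));
        }
        return pages;
    }

    static List<Page> getItems(int total, LocalDate minDate, LocalDate maxDate, int maxResults) {
        List<Page> items = search(total);
        // Step 2: filter in memory, keeping dates strictly between the bounds.
        items.removeIf(p -> !(p.date.isAfter(minDate) && p.date.isBefore(maxDate)));
        // Step 3: sort in memory.
        items.sort(Comparator.comparing((Page p) -> p.date));
        // Step 4: shrink; with Integer.MAX_VALUE this keeps everything.
        return items.size() > maxResults ? items.subList(0, maxResults) : items;
    }

    // What a pager does afterwards: cut one page out of the full list.
    static List<Page> pageItems(List<Page> items, int offset, int pageSize) {
        int from = Math.min(offset, items.size());
        int to = Math.min(offset + pageSize, items.size());
        return items.subList(from, to);
    }

    public static void main(String[] args) {
        List<Page> all = getItems(20_000, LocalDate.of(2015, 12, 31),
                LocalDate.of(2017, 1, 1), Integer.MAX_VALUE);
        List<Page> page = pageItems(all, 0, 20);
        System.out.println(all.size() + " items built, " + page.size() + " rendered");
    }
}
```

With these made-up inputs all items pass the filter, so the whole list is built and sorted just to render one page of it.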
Step 1a: Constructing the query (TemplateCategoryUtil)
    public static List<Node> getContentListByTemplateNames(...) {
        (...)
        StringBuffer sql = new StringBuffer(
            "select * from nt:base where jcr:path like '" + path + "/%'");
        (...add 'mgnl:template=' clauses...)
        (...add 'ORDER BY' clauses...)
        return getWrappedNodesFromQuery(sql.toString(), repository, maxResultSize);
    }
maxResultSize == Integer.MAX_VALUE
Step 1b: Executing the query (TemplateCategoryUtil)
    public static List<Node> getContentListByTemplateNames(...) {
        (...)
        NodeIterator items = QueryUtil.search(
            repository, sql.toString(), Query.SQL, NodeTypes.Content.NAME);
    }
Step 2: Filtering the item list (STKDateContentUtil)
    public static void filterDateContentList(...) {
        CollectionUtils.filter(itemsList, new Predicate() {
            @Override
            public boolean evaluate(Object object) {
                (...)
                return date.after(minDate) && date.before(maxDate);
            }
        });
    }
Step 3: Time to sort (STKDateContentUtil)
    public static void sortDateContentList(...) {
        Collections.sort(itemsList, new Comparator<Node>() {
            @Override
            public int compare(Node c1, Node c2) {
                (...)
                if (StringUtils.equals(sortDirection, ASCENDING)) {
                    return date2.compareTo(date1);
                }
                return date1.compareTo(date2);
            }
        });
    }
Step 4: Shrinking the list (STKTemplatingFunctions)
    public List<Node> cutList(List<Node> itemsList, final int maxResults) {
        if (itemsList.size() > maxResults) {
            return itemsList.subList(0, maxResults);
        }
        return itemsList;
    }
NewsOverviewModel passes Integer.MAX_VALUE, so shrink does effectively nothing in this case
Step 5: Get the items from the pager (STKPager)
    public Collection getPageItems() {
        Collection subList = items;
        int offset = getOffset();
        if (count > 0) {
            int limit = maxResultsPerPage + offset;
            if (items.size() < limit) {
                limit = count;
            }
            subList = ((List) items).subList(offset, limit);
        }
        return subList;
    }
maxResultsPerPage is typically something like 20
When this becomes a problem We have multiple sites like this
select * from nt:base where jcr:path like '/siteX/news/%' AND
mgnl:template = 'standard-templating-kit:pages/stkNews'
20000 pages under website:/siteX/news
Four step pipeline returns STKPager with 20000 items (page nodes)
[#assign pager = model.pager]
[#assign newsList = cmsfn.asContentMapList(pager.pageItems)!]
STKPager returns list with 20 page nodes
19980 Node objects created, but not rendered
A query could do all steps at once JCR queries are pretty flexible
Everything in a single JCR query Only 20 nodes returned
    SELECT * FROM nt:base
    WHERE jcr:path LIKE '/siteX/news/%'                               (Search)
      AND mgnl:template = 'standard-templating-kit:pages/stkNews'     (Search)
      AND jcr:created < cast('2016-06-07T00:00:00.000Z' AS DATE)      (Filter)
    ORDER BY date ASCENDING                                           (Sort)
    LIMIT 20 OFFSET 20                                                (Paging)
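The paging part is just arithmetic on a page number and a page size. A small sketch, assuming page numbers start at one; the class and method names are made up for illustration. As far as I know, the plain JCR 2.0 API takes these two values through `Query.setLimit` and `Query.setOffset` rather than in the query text, so the `LIMIT ... OFFSET ...` string here only mirrors the notation on the slide.

```java
class PageWindowSketch {

    /** Number of rows to skip for a page number that starts at one. */
    static long offsetFor(int pageNumberStartingFromOne, int itemsPerPage) {
        if (pageNumberStartingFromOne < 1 || itemsPerPage < 1) {
            throw new IllegalArgumentException("page number and page size must be >= 1");
        }
        // Widen before multiplying so large page numbers cannot overflow int.
        return (long) (pageNumberStartingFromOne - 1) * itemsPerPage;
    }

    /** Renders the paging clause in the notation used on the slide. */
    static String pagingClause(int pageNumberStartingFromOne, int itemsPerPage) {
        return "LIMIT " + itemsPerPage
                + " OFFSET " + offsetFor(pageNumberStartingFromOne, itemsPerPage);
    }

    public static void main(String[] args) {
        System.out.println(pagingClause(1, 20)); // first page
        System.out.println(pagingClause(2, 20)); // second page
    }
}
```

With 20 items per page, `LIMIT 20 OFFSET 20` is the second page; the first page has offset 0.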
Criteria API For those familiar with Hibernate/JPA
    Criteria criteria = JCRCriteriaFactory.createCriteria()
        .setBasePath("/siteX/news")                                          // Search
        .add(Restrictions.eq(
            "@mgnl:template", "standard-templating-kit:pages/stkNews"))      // Search
        .add(Restrictions.betweenDates("@jcr:created", minDate, maxDate))    // Filter
        .addOrder(Order.asc("date"))                                         // Sort
        .setPaging(20, 1);                                                   // Paging
    ResultIterator<...> items = criteria.execute(session).getItems();
Criteria API for Magnolia CMS Magnolia module by Openmind
Custom pager Only a single page worth of items
    public class VtkPager<T> extends STKPager {
        private final List<? extends T> items;
        private final int pageSize;
        private final int count;
        (...)
        @Override
        public List<? extends T> getPageItems() {
            return items;
        }
    }
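A pager like this no longer needs the full list: one page of items plus the total count is enough to render the page links. A minimal, framework-free sketch of that idea follows. `OnePagePagerSketch` and `getNumberOfPages` are hypothetical names, not part of STK or of the VtkPager shown above.

```java
import java.util.List;

// Hypothetical analogue of a pager that holds a single page of items.
final class OnePagePagerSketch<T> {
    private final List<T> items;  // only the items of the current page
    private final int pageSize;
    private final int count;      // total number of matches reported by the query

    OnePagePagerSketch(List<T> items, int pageSize, int count) {
        if (pageSize < 1) {
            throw new IllegalArgumentException("pageSize must be >= 1");
        }
        this.items = List.copyOf(items);
        this.pageSize = pageSize;
        this.count = count;
    }

    // No subList needed: the query already returned exactly one page.
    List<T> getPageItems() {
        return items;
    }

    // Ceiling division: a partially filled last page still counts as a page.
    int getNumberOfPages() {
        return (count + pageSize - 1) / pageSize;
    }
}
```

The total count is the one thing the pager cannot derive from its own items, which is why the query result has to supply it.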
Use it in your model classes VtkContentListModel (vpro)
    public abstract class VtkContentListModel ... {
        // Not final: the pager is assigned in execute(), not in the constructor.
        protected VtkPager<T> pager;

        @Override
        public String execute() {
            pager = createPager();
            return super.execute();
        }

        // Subclasses build the pager from a paged query.
        protected abstract VtkPager<T> createPager();
        (...)
    }
Concrete Example VtkNewsOverviewModel (vpro)
    @Override
    protected VtkPager<Node> createPager() {
        (...)
        AdvancedResult result = JCRCriteriaFactory.createCriteria()
            .setBasePath(path)
            .add(Restrictions.in("@mgnl:template", templates))
            .add(Restrictions.betweenDates("@jcr:created", minDate, maxDate))
            .addOrder(Order.asc("date"))
            .setPaging(itemsPerPage, pageNumberStartingFromOne)
            .execute(session);

        List<Node> items = new ArrayList<>();
        for (AdvancedResultItem item : result.getItems()) {
            items.add(item.getJCRNode());
        }
        int count = result.getTotalSize();
        return new VtkPager<>(link, items, content, itemsPerPage, count);
    }
Still this. Was it all for nothing? :o(
WARN info.magnolia.module.cache.filter.CacheFilter -- The following URL took longer than 10 seconds (63969 ms) to render. This might cause timeout exceptions on other requests to the same URI.
Example VtkNewsOverviewModel (vpro)
    @Override
    protected VtkPager<Node> createPager() {
        (...)
        AdvancedResult result = JCRCriteriaFactory.createCriteria()
            .setBasePath(path)
            .add(Restrictions.in("@mgnl:template", templates))
            .add(Restrictions.betweenDates("@jcr:created", minDate, maxDate))
            .addOrder(Order.asc("date"))
            .setPaging(itemsPerPage, pageNumberStartingFromOne)
            .execute(session);

        List<Node> items = new ArrayList<>();
        for (AdvancedResultItem item : result.getItems()) {
            items.add(item.getJCRNode());
        }
        int count = result.getTotalSize(); // This call takes 10-60+ seconds!
        return new VtkPager<>(link, items, content, itemsPerPage, count);
    }
AdvancedResultImpl (jcr-criteria)
    @Override
    public int getTotalSize() {
        if (totalResults == null) {
            int queryTotalSize = -1;
            try {
                // jcrQueryResult instanceof JackrabbitQueryResult) {
                Method m = jcrQueryResult.getClass().getMethod("getTotalSize");
                queryTotalSize = (int) m.invoke(jcrQueryResult);
            } catch (InvocationTargetException | IllegalAccessException e) {
                LOG.error(e.getMessage(), e);
            } catch (NoSuchMethodException e) {
            }
            if (queryTotalSize == -1 && (itemsPerPage == 0 || applyLocalPaging)) {
                try {
                    totalResults = (int) jcrQueryResult.getNodes().getSize();
                } catch (RepositoryException e) {
                    // ignore, the standard total size will be returned
                }
            }
            if (queryTotalSize == -1) {
                totalResults = queryCounter.getAsInt();
            } else {
                totalResults = queryTotalSize;
            }
        }
        return totalResults;
    }
We end up here
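The first part of that method is a general pattern: look up an optional method by name and fall back to -1 when the implementation does not offer it. A self-contained sketch of just that pattern; `WithTotal`, `WithoutTotal` and `totalSizeOrMinusOne` are hypothetical names and this is not the jcr-criteria code.

```java
import java.lang.reflect.InvocationTargetException;
import java.lang.reflect.Method;

class OptionalMethodSketch {

    /** Hypothetical result type that happens to expose getTotalSize(). */
    public static class WithTotal {
        public int getTotalSize() {
            return 20000;
        }
    }

    /** Hypothetical result type without such a method. */
    public static class WithoutTotal {
    }

    /** Returns target.getTotalSize() if a public no-arg method exists, else -1. */
    static int totalSizeOrMinusOne(Object target) {
        try {
            Method m = target.getClass().getMethod("getTotalSize");
            Object value = m.invoke(target);
            // Accept any numeric return type; anything else counts as unknown.
            return (value instanceof Number) ? ((Number) value).intValue() : -1;
        } catch (NoSuchMethodException e) {
            return -1; // this implementation has no such method
        } catch (IllegalAccessException | InvocationTargetException e) {
            return -1; // present but not callable, treat the size as unknown
        }
    }

    public static void main(String[] args) {
        System.out.println(totalSizeOrMinusOne(new WithTotal()));
        System.out.println(totalSizeOrMinusOne(new WithoutTotal()));
    }
}
```

The reflective call itself is cheap. How long it takes depends entirely on what the method behind it does.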
jackrabbit-core 2.8.0 (QueryResultImpl)
    protected void getResults(long size) throws RepositoryException {
        (...)
        result = executeQuery(maxResultSize); // Lucene query
        (...)
        // Doesn’t use result.getSize(), call collectScoreNodes(...)
    }

    private void collectScoreNodes(...) {
        while (collector.size() < maxResults) {
            ScoreNode[] sn = hits.nextScoreNodes();
            (...)
            // check access
            if (isAccessGranted(sn)) {
                collector.add(sn);
            } else {
                invalid++;
            }
        }
    }
It used to be fast! https://issues.apache.org/jira/browse/JCR-3858
jackrabbit-core 2.10.0+ (QueryResultImpl)
    protected void getResults(long size) throws RepositoryException {
        (...)
        if (sizeEstimate) {
            numResults = result.getSize(); // Use count from Lucene
        } else {
            // do things the Jackrabbit 2.8.0 way
            (...)
        }
        (...)
    }
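The difference between the two code paths can be shown with a toy model: an exact count visits every hit and runs an access check on it, while an estimate takes the hit count from the index as is. This is a simplified simulation under made-up assumptions (integer hits, a caller-supplied access rule), not Jackrabbit code. Presumably the estimate can include nodes the current user may not read, which would be why it is called an estimate.

```java
import java.util.function.IntPredicate;

class CountStrategiesSketch {

    /** Exact count: every hit is visited and access-checked, as in the 2.8.0 excerpt. */
    static int exactCount(int hits, IntPredicate accessGranted) {
        int granted = 0;
        for (int hit = 0; hit < hits; hit++) {
            if (accessGranted.test(hit)) {
                granted++;
            }
        }
        return granted;
    }

    /** Estimate: take the hit count from the index, without any access checks. */
    static int estimatedCount(int hits) {
        return hits;
    }

    public static void main(String[] args) {
        int[] checks = {0};
        // Hypothetical access rule: every 100th hit is not readable.
        int exact = exactCount(20_000, hit -> {
            checks[0]++;
            return hit % 100 != 0;
        });
        System.out.println("exact=" + exact + " after " + checks[0] + " access checks");
        System.out.println("estimate=" + estimatedCount(20_000) + " after 0 access checks");
    }
}
```

The exact count costs one access check per hit, so it grows with the size of the result set. The estimate costs the same regardless of how many pages match.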
Enable Jackrabbit’s 'sizeEstimate' Jackrabbit 2.10+
    <SearchIndex class="org.apache.jackrabbit.core.query.lucene.SearchIndex">
        (...)
        <param name="sizeEstimate" value="true"/>
    </SearchIndex>
Rendering times down to 1-2 seconds Bingo
Time for questions
Anyone?