Query Log Analyzer · 2019. 10. 16.

Post on 20-Jan-2021


Improve Query Performance with the Query Log Analyzer
Kees Vegter, Field Engineer
kees@neo4j.com

Query Log Analyzer

Query Log

dbms.logs.query.enabled=true
# If the execution of query takes more time than this threshold,
# the query is logged. If set to zero then all queries are logged.
dbms.logs.query.threshold=100ms
dbms.logs.query.parameter_logging_enabled=true
dbms.logs.query.time_logging_enabled=true
dbms.logs.query.allocation_logging_enabled=true
dbms.logs.query.page_logging_enabled=true
dbms.track_query_cpu_time=true
dbms.track_query_allocation=true
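A quick way to confirm that a server has these switches turned on is to scan its configuration text for them. The sketch below is only an illustration: the setting names come from the slide, while the helper names (parse_conf, missing_query_log_settings) are hypothetical and not part of any Neo4j tooling. The threshold is left out of the check because 100ms is an example value, not a requirement.

```python
# Boolean switches listed on the slide. The helper names below are
# hypothetical; this is not an official Neo4j utility.
QUERY_LOG_FLAGS = [
    "dbms.logs.query.enabled",
    "dbms.logs.query.parameter_logging_enabled",
    "dbms.logs.query.time_logging_enabled",
    "dbms.logs.query.allocation_logging_enabled",
    "dbms.logs.query.page_logging_enabled",
    "dbms.track_query_cpu_time",
    "dbms.track_query_allocation",
]


def parse_conf(text):
    """Parse key=value lines, skipping blank lines and # comments."""
    settings = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        key, sep, value = line.partition("=")
        if sep:
            settings[key.strip()] = value.strip()
    return settings


def missing_query_log_settings(text):
    """Return the flags from the slide that are absent or not set to true."""
    settings = parse_conf(text)
    return sorted(k for k in QUERY_LOG_FLAGS if settings.get(k) != "true")
```

Passing the contents of a configuration file to missing_query_log_settings returns the names that still need attention; an empty list means all seven switches are set to true.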

Query Log Analyzer: Query Analysis

Query Log Analyzer: Query Log: Filter

Query Log Analyzer: Query Log: Highlight

Query Log Analyzer: Query Timeline

Cypher Query Processing

Cypher Planning: Query String > Parse > Logical Plan > Physical Execution Plan (planning uses db-statistics)

Cypher Execution: Execute Physical Plan in Cypher Runtime

Query Plan Cache: Query String > cached Physical Execution Plan > Execute Physical Plan in Cypher Runtime

Use query parameters!

Use repeatable statements!

Cypher Execution

Cypher Planning

Query Load

MATCH p=(ah:AccountHolder {fullName: $accountName})-[:HAS_BANKACCOUNT]->(ba)-[:SEND*2..16]->()
WITH p, [x IN nodes(p) WHERE x:BankAccount] AS mts
UNWIND mts AS mt
MATCH p2=(mt)-[:FROM]->()-[:IN_COUNTRY]->()
RETURN p, p2 SKIP 0 LIMIT 1000

Cypher Planning

● Parameter Usage
○ Check the tool header

○ Check for parameter usage in your queries

● Planning time

1775 queries analysed, 302 distinct queries found.

1775 queries analysed, 1775 distinct queries found.

MATCH (ah:AccountHolder) WHERE ah.fullName = $fullName...RETURN ah

MATCH (ah:AccountHolder) WHERE ah.fullName = "John Smith"...RETURN ah
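The difference between the two header lines can be reproduced with a few lines of Python. This is a simplified sketch: the query text is a shortened stand-in (the slide elides part of the original query), the generated names are made up, and count_distinct is a hypothetical helper. With a single parameterised template the distinct count drops to 1 rather than the 302 reported on the slide.

```python
def count_distinct(queries):
    """Return (queries analysed, distinct query strings)."""
    return len(queries), len(set(queries))


# Made-up names, one per request.
names = [f"Person {i}" for i in range(1775)]

# Literal values baked into the text: every query string is unique,
# so every request needs its own planning step.
literal_queries = [
    f'MATCH (ah:AccountHolder) WHERE ah.fullName = "{name}" RETURN ah'
    for name in names
]

# Parameterised: the text stays the same and only the parameter map
# changes, so the cached plan can be reused.
PARAM_QUERY = "MATCH (ah:AccountHolder) WHERE ah.fullName = $fullName RETURN ah"
param_requests = [(PARAM_QUERY, {"fullName": name}) for name in names]
param_queries = [query for query, _params in param_requests]
```

Calling count_distinct on both lists shows the effect the tool header reports: the literal variant yields as many distinct queries as requests, the parameterised variant does not.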

Cypher Execution

● Page Cache (data cache)

● Waiting for Locks

● Memory Footprint

Cypher Execution: 24 % read from cache, 76 % read from disk
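A split like this can be derived from two counters per query. The sketch below assumes the percentages come from page hits (reads served from the page cache) and page faults (reads that had to go to disk), which is what page logging records; how the Query Log Analyzer computes its figure internally is not stated in the slides, and page_cache_split is a hypothetical helper.

```python
def page_cache_split(page_hits, page_faults):
    """Return (percent read from cache, percent read from disk).

    Assumption: page hits are served from the page cache and page faults
    are read from disk. Returns None when no pages were touched.
    """
    total = page_hits + page_faults
    if total == 0:
        return None
    cache_pct = 100.0 * page_hits / total
    return cache_pct, 100.0 - cache_pct
```

A query with a low cache percentage is a hint that the page cache is too small for the data it touches, or that the cache was still cold when the query ran.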

● Locking

● Concurrent Load

● Big Result Sets

Query Load

Query Tuning Tips

Query Tuning

Query Tuning: Use Explain and Profile

Things to check:
● Index usage
● Eager
● NodeByLabelScan
● AllNodesScan

Query Tuning: Avoid Cartesian Products

…
OPTIONAL MATCH
OPTIONAL MATCH
OPTIONAL MATCH
...

MATCH (a), (b), (c)
RETURN a, b, c

…
UNWIND arrA AS a
UNWIND arrB AS b
UNWIND arrC AS c
...

Use WITH and COLLECT and DISTINCT to reduce the intermediate results.
Use Pattern Comprehension when applicable:

MATCH (a)
RETURN {
  a: a,
  blist: [ (a)-->(b) | {b: b, clist: [ (b)-->(c) | c ]} ],
  dlist: [ (a)-->(d) | {d: d, elist: [ (d)-->(e) | e ]} ],
  flist: [ (a)-->(f) | f ]
}

Query Tuning: Reduce the query working set as soon as possible

● Can I move a DISTINCT to an earlier point in the query?

● Can I move a LIMIT to an earlier point in the query?

● Can I use COLLECT at places in the query to reduce the number of rows to be processed?

Query Tuning: Query Execution

● Try to send ‘repeatable’ statements

Not repeatable (literal values in the statement text):

MERGE (author1:Author {id: 1})
MERGE (author2:Author {id: 2})
...
MERGE (book1:Book {title: "title 1"})
MERGE (book2:Book {title: "title-2"})
...
MERGE (author1)-[:WROTE]->(book1)
MERGE (author2)-[:WROTE]->(book2)
...

Repeatable (values passed as parameters):

MERGE (author:Author {id: $authorId})
MERGE (book:Book {title: $bookTitle})
MERGE (author)-[:WROTE]->(book)

Query Tuning: Query Execution

● Reduce the number of statements you send to Neo4j by using 'batch' statements

Batched: for every 100 entries in the list of authors and books, fire one statement to Neo4j:

UNWIND $inputList AS row
MERGE (author:Author {id: row.authorId})
MERGE (book:Book {title: row.bookTitle})
MERGE (author)-[:WROTE]->(book)

{ inputList : [ { authorId : 1, bookTitle : "title1" }, { authorId : 2, bookTitle : "title2" }, ... ] }

Unbatched: for every entry in the list of authors and books, fire one statement to Neo4j:

MERGE (author:Author {id: $authorId})
MERGE (book:Book {title: $bookTitle})
MERGE (author)-[:WROTE]->(book)

{ authorId : 1, bookTitle : "title1" }
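On the client side, the batched variant amounts to slicing the input list into chunks of 100 and sending one UNWIND statement per chunk. The following Python sketch shows that loop. The names chunks, send_in_batches and run_statement are hypothetical; run_statement stands for whatever function executes a statement with parameters in your setup (for example a thin wrapper around a driver session), and is injected so the sketch makes no assumption about a particular driver.

```python
# Statement text taken from the slide; it expects a list parameter $inputList.
BATCH_STATEMENT = """
UNWIND $inputList AS row
MERGE (author:Author {id: row.authorId})
MERGE (book:Book {title: row.bookTitle})
MERGE (author)-[:WROTE]->(book)
"""


def chunks(entries, size=100):
    """Yield consecutive slices of at most `size` entries."""
    for start in range(0, len(entries), size):
        yield entries[start:start + size]


def send_in_batches(entries, run_statement, size=100):
    """Fire one statement per batch and return the number of statements sent.

    run_statement(query, parameters) is a caller-supplied callable
    (hypothetical here) that executes the statement against Neo4j.
    """
    sent = 0
    for batch in chunks(entries, size):
        run_statement(BATCH_STATEMENT, {"inputList": batch})
        sent += 1
    return sent
```

With 250 entries this sends 3 statements instead of 250, and because the statement text never changes it remains a repeatable statement.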

Query Tuning: Query Execution

● Use apoc.periodic.iterate with the config parameter iterateList: true!

CALL apoc.periodic.iterate(
  'CALL apoc.load.jdbc("mydb","SELECT authorId, bookTitle FROM AuthorBooks") YIELD row RETURN row',
  'MERGE (author:Author {id: row.authorId})
   MERGE (book:Book {title: row.bookTitle})
   MERGE (author)-[:WROTE]->(book)',
  {batchSize: 100, iterateList: true}
)

● Kettle also uses this 'batch' approach

Tool Usage

● The Query Log Analyzer is meant to be used during development and testing!

● When only a command prompt is available on a Neo4j server, you can use the following tool for a quick analysis of the query.log file:

https://neo4j.com/developer/kb/an-approach-to-parsing-the-query-log/

This tool will list the top 10 most expensive queries based on planning, CPU and waiting time.

Next Version

● Supports Neo4j version 4 (multi db)

● List Current queries

● List Query Stats (version 3.5.4 and higher)

● Explain Plan

Still under development

Multi db support

preview, still under development

Current Queries

preview, still under development

Query Stats

preview, still under development

Explain Plan

preview, still under development

Useful links

Introducing the Query Log Analyzer
https://medium.com/neo4j/meet-the-query-log-analyzer-30b3eb4b1d6

Cypher Query Optimisations
https://medium.com/neo4j/cypher-query-optimisations-fe0539ce2e5c

Script to get the top 10 most expensive queries from the command line
https://neo4j.com/developer/kb/an-approach-to-parsing-the-query-log/

Hunger Games Questions for "Improve Query Performance with Query Log Analyzer"

1. Easy: What does Avg Waiting stand for?
a. Waiting to execute query
b. Waiting to execute query + waiting for locks
c. Waiting for locks

2. Medium: What is the correct order of steps in the Cypher Query Processing?
a. Query Text > Logical Plan > Parse > Physical Execution Plan > Execute Physical Plan in Cypher Runtime
b. Query Text > Parse > Logical Plan > Physical Execution Plan > Execute Physical Plan in Cypher Runtime
c. Cache > Physical Execution Plan > Execute Physical Plan in Cypher Runtime

3. Hard: What is the name of the config parameter in apoc.periodic.iterate that makes batch updates possible?

Answer here: r.neo4j.com/hunger-games

Q & A

Query Log Analyzer install

https://install.graphapp.io/
