aliss technical reference group november 2017


Upload: aliss-programme

Post on 23-Jan-2018


ALISS Technical Reference Group


Agenda

• Introduction to the Health and Social Care Alliance Scotland (the ALLIANCE)

• Information management and quality issues with the outgoing product
  • Search
  • Data quality
  • Duplication
  • Incentives
  • Location

• New website (lessons learned)
  • Duplication solution
  • Data quality solution
  • Location solution
  • Relevance?

• Categorisation

• Roadmap


Introduction to the ALLIANCE

Our vision is for a Scotland where people of all ages who are disabled or living with long term conditions, and unpaid carers, have a strong voice and enjoy their right to live well, as equal and active citizens, free from discrimination, with support and services that put them at the centre.

• National intermediary, 3rd Sector organisation and strategic partner of the Scottish Government.

• Membership organisation, over 2,100 members.

• The ALLIANCE delivers its vision through many projects and programmes.


Search on aliss.org

We allowed people to search “Health and wellbeing resources”.

Resources included:

• Title

• Description

• URL

• Tags

• Location: address and latitude/longitude
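As a rough illustration of the record described above, the fields could be modelled as follows. This is only a sketch: the class and attribute names are assumptions made for the example, not the actual ALISS schema, and the values are placeholders.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Location:
    # Hypothetical shape: an address plus a latitude/longitude pair.
    address: str
    latitude: float
    longitude: float


@dataclass
class Resource:
    # The five pieces of information listed on the slide.
    title: str
    description: str
    url: str
    tags: List[str] = field(default_factory=list)
    # A list, because a resource could carry more than one location.
    locations: List[Location] = field(default_factory=list)


# Placeholder values only, not real ALISS data.
example = Resource(
    title="Example befriending service",
    description="Placeholder description.",
    url="https://example.org/",
    tags=["befriending"],
    locations=[Location(address="1 Example Street", latitude=0.0, longitude=0.0)],
)
```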


Search

• Search was done using Elasticsearch.

• Users could search using keywords, location (latitude and longitude), and radius.

• Search used a ‘multi_match’ query of type ‘most_fields’ across the fields ‘title’, ‘description’, ‘tags’ and ‘url’.

• This meant that it ran a full-text query against each field, calculated a score for each using TF/IDF relevance, then added up all the scores to give a total score.
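The search described on this slide might have looked roughly like the request body below. It is a sketch under assumptions: the slides do not show the real query, the geo field name `location` (which would need to be mapped as a `geo_point`) is a guess, and applying the radius as a `geo_distance` filter is one plausible reading of "location and radius". The function name `build_search_query` is hypothetical.

```python
from typing import Any, Dict


def build_search_query(keywords: str, lat: float, lon: float, radius_km: float) -> Dict[str, Any]:
    """Build an Elasticsearch request body for a keyword search within a radius."""
    return {
        "query": {
            "bool": {
                # Full-text match on every field; per-field scores are summed.
                "must": {
                    "multi_match": {
                        "query": keywords,
                        "type": "most_fields",
                        "fields": ["title", "description", "tags", "url"],
                    }
                },
                # Assumed: the radius restricts results without affecting the score.
                "filter": {
                    "geo_distance": {
                        "distance": f"{radius_km}km",
                        "location": {"lat": lat, "lon": lon},
                    }
                },
            }
        }
    }


body = build_search_query("scouts", lat=57.6, lon=-4.4, radius_km=10)
```

The resulting dictionary can be sent as the body of a search request by whichever Elasticsearch client is in use.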


Search

• Previously we used ‘best_fields’, which scored each field individually and then ordered results by the best score of any field.

• However, this meant that resources with short descriptions, or perhaps a single tag that exactly matched the search terms, would rank higher than resources that were intrinsically better from the users' point of view.

• This tied into some of the problems of data quality we had.
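The difference between the two scoring modes can be shown with a toy calculation. The per-field scores below are invented for illustration, and the two functions are simplifications: they take the maximum and the sum of the field scores, which is the behaviour the slides describe, and ignore any further adjustments Elasticsearch applies.

```python
from typing import Dict

FieldScores = Dict[str, float]


def best_fields_score(scores: FieldScores) -> float:
    # Simplified 'best_fields': the highest single field score wins.
    return max(scores.values(), default=0.0)


def most_fields_score(scores: FieldScores) -> float:
    # Simplified 'most_fields': every matching field contributes.
    return sum(scores.values())


# Hypothetical scores: a thin entry whose only tag matches exactly,
# and a fuller entry that matches moderately in several fields.
thin_entry = {"title": 0.0, "description": 0.0, "tags": 3.0, "url": 0.0}
rich_entry = {"title": 2.0, "description": 2.0, "tags": 1.5, "url": 0.0}

for name, score in (("best_fields", best_fields_score), ("most_fields", most_fields_score)):
    print(name, "thin:", score(thin_entry), "rich:", score(rich_entry))
```

With these made-up numbers the thin entry ranks first under the ‘best_fields’ rule and the fuller entry ranks first under the ‘most_fields’ rule.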


Data quality

• ALISS also used a dataset of 33,000 resources that were not designed for public consumption. It was created as part of a reporting tool on charities' activities, in which charities had to provide information on themselves and their programmes.

• Much of the information was out of date, had inappropriate (or missing) descriptions and was generally based on “organisations” rather than specific “resources”.


Duplication

• Many of the resources were also duplicates, but in subtle ways that we had trouble spotting via automation, e.g. cases where the title was identical but the description, locations and tags were different.

• A further problem was that even if we could identify these duplicates, it was hard to remove them. Many users of the API filtered for only the resources they had added, so removing one person's resource in favour of another's caused complaints from users.
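One simple way to surface the "identical title" case is to group records by a normalised title, as sketched below. This is not the approach ALISS used (the slides say automation struggled here); the dictionary keys `title` and `added_by` and the helper names are assumptions for the example. The output is only a list of candidates for a person to review, since the other fields may legitimately differ.

```python
from collections import defaultdict
from typing import Dict, List


def normalise_title(title: str) -> str:
    # Lower-case and collapse whitespace so trivial variations still match.
    return " ".join(title.lower().split())


def duplicate_candidates(resources: List[dict]) -> Dict[str, List[dict]]:
    """Return groups of resources that share a normalised title."""
    groups: Dict[str, List[dict]] = defaultdict(list)
    for resource in resources:
        groups[normalise_title(resource["title"])].append(resource)
    # Keep only titles that occur more than once.
    return {title: group for title, group in groups.items() if len(group) > 1}


# Placeholder records, not real ALISS data.
resources = [
    {"title": "Walking Group", "added_by": "user_a", "tags": ["walking"]},
    {"title": "walking  group", "added_by": "user_b", "tags": ["exercise"]},
    {"title": "Choir", "added_by": "user_a", "tags": ["music"]},
]

candidates = duplicate_candidates(resources)
```

Keeping `added_by` on each record makes it visible which contributors would lose an entry if a group were merged, which is the source of the complaints mentioned above.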


Incentives



Locations

• The problem with locations was that the system only stored locations as latitude and longitude, and many users tried to make their resource show up in more searches by including multiple, wide-ranging locations.

• There were also some excellent examples of health and wellbeing resources that covered whole geographical areas rather than a single location. These were available over the phone or delivered to users, such as telephone befriending services or “meals-on-wheels” style services.


User’s perspective

1. Someone searched for “Scouts Dingwall” in the Chrome browser.

2. The ALISS entry was at the top of the results.

3. The person clicked through to the page.

4. It was a Get Involved entry for “1st Dingwall Beaver Scouts”: no description, the location is in London, and the URL leads to Get Involved, where no information is held.

5. They then typed “Scouts” and “Dingwall” into ALISS and got 12 unhelpful results.

6. They clicked on the 10th result, which was a LiU Argyll & Bute entry for 1st Dingwall Scouts. The URL links to a non-existent Facebook page, the tag has two locations in it, and the location points to a mountain.

7. Unable to find useful information, the person reported the resource and asked for advice: “Hi I have a 9 year old son who would like to join the scouts…”.

8. There is no 1st Dingwall Scout group; it's the 1st Ross & Sutherland group.


New website


Duplication solution

• We changed our schema to have organisations and services rather than just “resources”, because this makes duplication problems much easier to solve.

• We have one canonical organisation entry for each organisation, which can be “claimed” by a representative of the organisation.

• If a duplicate of an organisation is incorrectly added, it is easier to point to the older, more developed or claimed entry as the correct canonical one and delete the other.

• Similarly, service duplicates are easier to deal with in the same fashion.
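The choice of a canonical entry among duplicates could be sketched as below. The slides name three signals (older, more developed, claimed) but do not say how they are weighed, so the priority order used here (claimed first, then completeness, then age) is an assumption, as are the `Organisation` fields and the use of a filled-field count as a stand-in for "more developed".

```python
from dataclasses import dataclass
from datetime import date
from typing import List


@dataclass
class Organisation:
    name: str
    created: date
    claimed: bool
    filled_fields: int  # hypothetical proxy for how developed the entry is


def pick_canonical(duplicates: List[Organisation]) -> Organisation:
    """Pick the entry to keep; raises ValueError if the list is empty."""
    # Assumed priority: claimed entries, then more complete ones, then older ones.
    return min(
        duplicates,
        key=lambda org: (not org.claimed, -org.filled_fields, org.created),
    )


# Placeholder entries, not real ALISS data.
entries = [
    Organisation("Example Org", date(2016, 5, 1), claimed=False, filled_fields=6),
    Organisation("Example Org", date(2017, 3, 1), claimed=True, filled_fields=4),
    Organisation("Example Org", date(2015, 1, 1), claimed=False, filled_fields=2),
]

canonical = pick_canonical(entries)
```

In practice a moderator would probably confirm the choice before the other entries are deleted.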


Poor quality information solution

• We have changed to only allow ALISS staff to add organisations and services.

• We are training volunteers and staff as moderators to add and edit information.

• We are also encouraging organisations to claim their listings.

• We have created data standards to benchmark minimum quality requirements and help encourage users to input good content.

• We will no longer mass-import any datasets that harm the quality of the data.


Location solution

• We allow users to give services both a physical location and an “Area Served” if required.

• Areas Served are official areas taken from the National Records of Scotland Scottish Postcode Directory dataset.

• When a user searches with a postcode, we are able to match resources that are marked as serving that area and show them in the search results. This means that telephone services or delivery services can be found more easily in relevant areas without having an inappropriate or inaccurate physical location.
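The matching step can be sketched as a lookup from postcode to area codes followed by a set intersection. Everything concrete here is hypothetical: `POSTCODE_AREAS` stands in for a table derived from the Scottish Postcode Directory, the postcodes and area codes are made up, and the `Service` fields are assumptions rather than the real schema.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Set

# Hypothetical stand-in for a lookup built from the Scottish Postcode Directory.
POSTCODE_AREAS: Dict[str, Set[str]] = {
    "ZZ1 1AA": {"AREA-NORTH"},
    "ZZ2 2BB": {"AREA-SOUTH"},
}


@dataclass
class Service:
    name: str
    areas_served: Set[str] = field(default_factory=set)


def normalise_postcode(postcode: str) -> str:
    # Upper-case and collapse whitespace so "zz1  1aa" matches "ZZ1 1AA".
    return " ".join(postcode.upper().split())


def services_serving(postcode: str, services: List[Service]) -> List[Service]:
    """Return services whose Areas Served include an area containing the postcode."""
    areas = POSTCODE_AREAS.get(normalise_postcode(postcode), set())
    return [service for service in services if service.areas_served & areas]


# Placeholder services, not real ALISS data.
services = [
    Service("Telephone befriending", {"AREA-NORTH", "AREA-SOUTH"}),
    Service("Meal delivery", {"AREA-SOUTH"}),
    Service("Drop-in centre"),  # physical location only, no Area Served
]

matches = services_serving("zz1 1aa", services)
```

Services matched this way would be shown alongside those found by physical location, so a telephone service needs no artificial address to appear in a search.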


Relevance

When a user searches ALISS we want to return the best services to help them with their specific problem.

But…

The only two factors we have for putting the right listing in front of the user are the postcode and the search term (either free text or category).

What other considerations could or should we make in this area?

• Claimed organisation relevance factor?

• “I found this information useful” relevance factor?


Roadmap

• Categorisation work - ongoing

• Content generation - ongoing

• Launch beta website - February 2018

• Launch API version 3 documentation & Terms of Service - March 2018

• Business development (marketing, business models?) - April 2018