What Is Polaris?
Polaris comprises two technologies that can be used together or separately: Polaris Near Duplicate Technology ("PolarisND") and Polaris Email Threading Technology ("PolarisET"). PolarisND is a revolutionary tool that identifies, groups, and provides a review order for near-duplicate documents according to similarity thresholds established by a user. PolarisND can be used as a standalone application or run within many third-party applications. PolarisND is a product replacement for ALFind.
PolarisET provides a review order for emails, their attachments, and their entire threads, so that users can review whole threads rather than individual emails, making attorney review faster and more efficient. It also captures bibliographic information such as author, recipient, copyee, BCC, and date sent and, more importantly, captures the conversants throughout a thread: a list of all the names found in those categories across the entire email thread.
Both PolarisND and PolarisET can be purchased on a per-document or per-gigabyte basis, and enterprise licensing is also available. PolarisND has an API that can be implemented with any third-party software application, while the API for PolarisET is coming soon. Either technology can also be provided as a service and processed by Rosen Tech.
For additional information, to schedule a demonstration or to request Rosen Tech to process a sample set of data, email sales@rosentech.net or contact us at 312.251.4440.
Additional Information
ALCoder Autocoding Software and Polaris Near Duplicate/Email Threading Technology are utilized by law firms, service providers, government agencies and corporations. Below is a list of customers:
- Access Litigation
- Advanced Litigation Services
- AEA Group LLC
- AlphaLit
- Altep
- Andrus, Sceales
- Arkin Kaplan
- Ashbaugh Beal
- Avalon Copy Centers
- Bates & Carey
- Bieging Shapiro
- Bilzin Sumberg
- Blank Rome
- Bollinger, Ruberry & Garvey
- Boston Attorney General's Office
- Bridge City Legal
- Brinks Hofer
- Bromberg and Sunstein LLP
- Burns, White & Hickton
- C2 Legal
- CACI
- Caldwell Leslie
- California Attorney General's Office
- Carney Badley Spellman
- CD Lit/Digital Discovery
- Celerity Consulting Group, Inc.
- Certus EDM
- Childress Duffy Goldblatt
- Chittendon, Murday & Novotny
- Choate Hall & Stewart
- Comptroller of the Currency
- Connelly, Roberts & McGivney, LLC
- Coughlin Stoia Geller Rudman & Robbins
- CPR Group
- Counselor Resource Group
- Daley Mohan Groble
- Daspin & Aument
- Daticon
- Detroit Legal Imaging
- Dickinson Wright
- DLA Piper
- D4 eDiscovery
- Document Resources
- Document Services Unlimited
- Document Technologies Inc.
- Doffermyre, Shields, Canfield & Knowles
- eLit, Inc.
- ESI Productions
- Evidox
- Evolve Discovery
- Executive Office for US Attorneys
- Fairfield & Woods
- Fish & Richardson
- Foster Pepper PLLC
- Fraser Stryker
- Gaspar Digital
- Geon Legal Solutions - Ireland
- Gowlings Canada, Inc.
- Greenberg Glusker
- Grippo & Elden LLC
- Habeas Corpus Resource Center
- Hafez - Cairo, Egypt
- Hahn Loeser & Parks LLP
- Harper, Kynes, Geller, Greenleaf, Vogelbacher & Frayman, P.A.
- Haynsworth Sinkler Boyd, PA
- Hess Corporation
- Hinkhouse Williams Walsh LLP
- Hobs Legal - UK
- Holme, Roberts & Owen
- Howrey Simon
- iArchives
- Ignited Solutions
- Iris Data Services
- Kasowitz Benson Torres & Friedman
- Kirkland & Ellis
- Kurtz Law Offices
- Labat Anderson
- Laurie & Brennan
- Lawgical Choice
- Legastat, Ltd. - UK
- Lerach Coughlin
- Leydig Voit & Meyer
- Liner Yankelevitz Sunshine & Regenstreif
- Litigation Solutions Inc.
- Litigation Support Consulting
- Locke Lord Bissell Liddell
- Lockheed Martin
- Logik
- Lommen Abdo
- Lowenstein Sandler
- Mark & Associates
- Maslon Edelman Borman & Brand, LLP
- McCoy & Hofbauer
- McGuire Woods
- Miller & Chevalier
- Modus, LLC
- Morris, Downing & Sherred LLP
- National Legal
- New Zealand - Inland Revenue
- NorthStar Litigation
- Office of the Federal Public Defender
- Ogletree Deakins
- Onsite3
- Pangea3
- Polsinelli & Shughart
- Pomerantz Haudek Grossman & Gross LLP
- Precise, Inc.
- Pryor Cashman LLP
- Relevant Evidence
- Renaud Cook Drury Mesaros
- Rinnillo, Inc.
- Scarab
- Securities & Exchange Commission
- Simpson Boyd & Powers
- Stratify
- Studeo Legal
- Sunstein Kann Murphy & Timbers LLP
- Superior Document Services
- Target Litigation
- Teamlegal Solutions
- The Litigation Document Group
- Thomas & Libowitz
- Thompson Hine
- Townsend, Townsend & Crew
- Transperfect Legal Solutions
- Trial Images
- Troutman Sanders
- US Attorneys Office-Eastern District NY
- US Attorneys Office-Eastern District of PA
- USDOJ - Antitrust Division
- USDOJ - Civil Division
- USDOJ - Civil Rights Division
- USDOJ - California Attorney General's Office
- USDOJ - Environmental Division
- USDOJ - North Carolina
- Vorys Slater
- Wolf Greenfield & Sacks
- Wyatt, Tarrant & Combs, LLP
PolarisND identifies, groups, and provides a review order for near-duplicate documents according to similarity thresholds established by a user.
PolarisND allows you to:
- Significantly reduce attorney review time by identifying and grouping similar documents
- Consider the content of documents, rather than format, to determine similarity
- Reveal document relationships beyond those which are available from traditional near-duplicate analysis
PolarisND includes the following features:
- Identification of the one document which is the most representative within each near-duplicate family
- Easy data ingestion from OCR-only databases, extracted electronic text databases, or common load files which point to text files
- Simple, one step process to withhold redundant documents from a PolarisND export
- Identification and grouping of exact duplicates
Our clients use PolarisND data to:
- Include near-duplicate documents when creating review batches
- Enforce consistent review decisions on near-duplicate documents
- Identify exact duplicate documents according to extracted text/OCR
- Reduce the number of documents hosted in a review environment
- Perform quality control on outgoing productions
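As an illustration of enforcing consistent review decisions, the following minimal Python sketch flags near-duplicate families whose documents were coded differently. It assumes the PolarisND export has already been loaded into a list of dictionaries keyed by the field names DocID and ND_Family; the `decisions` mapping, the decision labels, and the function name are hypothetical and not part of PolarisND.

```python
from collections import defaultdict


def find_inconsistent_families(rows, decisions):
    """Return {ND_Family: set of decisions} for families coded inconsistently.

    rows      -- PolarisND export rows as dicts (loading step is assumed)
    decisions -- hypothetical mapping of DocID -> reviewer decision
    """
    seen = defaultdict(set)
    for row in rows:
        decision = decisions.get(row["DocID"])
        if decision is not None:  # skip documents not yet reviewed
            seen[row["ND_Family"]].add(decision)
    # Keep only families in which more than one distinct decision appears.
    return {family: tags for family, tags in seen.items() if len(tags) > 1}


# Hypothetical sample data for illustration only.
rows = [
    {"DocID": "DOC001", "ND_Family": "1-1"},
    {"DocID": "DOC002", "ND_Family": "1-1"},
    {"DocID": "DOC003", "ND_Family": "1-2"},
    {"DocID": "DOC004", "ND_Family": "1-2"},
]
decisions = {
    "DOC001": "Responsive",
    "DOC002": "Not Responsive",
    "DOC003": "Responsive",
    "DOC004": "Responsive",
}
conflicts = find_inconsistent_families(rows, decisions)
```

Families returned by a check like this could be routed back to a reviewer for a second look.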
PolarisND processing is available in three forms:
- Processing as an outsourced service
- Processing as a behind-the-client-firewall software install via the PolarisND processing User Interface
- Processing within 3rd party applications via PolarisND API integration
PolarisND Fields & Definitions
- DocID: The unique ID of the document being described by the row of data. This is the DocID that was supplied when data was initially loaded.
- ND_IsMaster: Identifies each Family's Master document. The Master document is the document which has been identified as being most representative of the near duplicate Family to which it belongs. Master documents will contain a 'Y' in this field. For all others, it will contain a blank space (' '). This field is quite different from how other near-duplicate technology works, and is unique to PolarisND.
- ND_Similarity: Indicates a document's level of similarity to the Master document of its Family. Master documents are 100% similar to themselves.
- ND_Family: A unique ID for each near-duplicate Family. Families are sets of documents that are all highly similar, if not identical, to each other. The Family ID is composed of two numbers separated by a hyphen. The first part identifies the cluster to which the family belongs. The second part identifies the family within that cluster.
- ND_Cluster: A unique ID for each Cluster. Clusters are a more broad grouping than Families. If two documents belong to a Cluster then they will have some relationship, but it may be loose or indirect.
- ND_ExactDupeSet: Identifies exact duplicates. If two documents share an ExactDupeSet ID, they share the exact same text. This grouping is based solely on textual content, so they may have differing metadata or be in different file formats. Because of this, programs which generate MD5 or SHA hashes may not assign the same hashes to documents which belong to the same ExactDupeSet.
- ND_Sort: Provides a sort order based on document similarity. Families and Clusters are kept together, and more closely related documents and families are placed closer together.
- ND_ReviewSet: An ID for each Review Set. PolarisND tries to place more closely-related documents and families, based on similarity, together in the same Review Set for review purposes.
- ND_ResultSet: The name of the result set being exported. This is a user-assigned name which can be used for tracking purposes. By default, PolarisND will create a name based on the date when the review set was generated.
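The fields above could be consumed along the following lines. This is a minimal Python sketch, not PolarisND's own API: it assumes the export rows have already been loaded into dictionaries keyed by field name and that ND_Sort holds a sortable number. The helper names and sample values are hypothetical.

```python
from collections import defaultdict


def split_family_id(family_id):
    """Split an ND_Family value such as '12-3' into its cluster and family parts."""
    cluster_part, family_part = family_id.split("-", 1)
    return cluster_part, family_part


def group_by_family(rows):
    """Group rows by ND_Family, listing members in ND_Sort order.

    Returns {family_id: {"master": DocID or None, "members": [DocID, ...]}}.
    """
    families = defaultdict(lambda: {"master": None, "members": []})
    for row in sorted(rows, key=lambda r: r["ND_Sort"]):
        family = families[row["ND_Family"]]
        family["members"].append(row["DocID"])
        # Master documents carry 'Y'; all others carry a blank space.
        if row["ND_IsMaster"].strip() == "Y":
            family["master"] = row["DocID"]
    return dict(families)


# Hypothetical sample rows for illustration only.
rows = [
    {"DocID": "DOC003", "ND_IsMaster": " ", "ND_Family": "1-1", "ND_Sort": 2},
    {"DocID": "DOC001", "ND_IsMaster": "Y", "ND_Family": "1-1", "ND_Sort": 1},
    {"DocID": "DOC007", "ND_IsMaster": "Y", "ND_Family": "1-2", "ND_Sort": 3},
]
families = group_by_family(rows)
```

Grouping this way puts each Family's Master alongside its near-duplicates, which is one way review batches might be assembled.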
PolarisET Fields & Definitions
- DocID: The unique ID of the document being described by the row of data. This is the DocID that was supplied when data was initially loaded.
- IsMessage: Indicates whether the document was recognized as an email message ('Y'), or some other kind of document ('N').
- MessageID: A unique ID that is assigned to each message. If several documents have the same Message ID, it is because they were recognized as separate copies of the same email message.
- ThreadID: A unique ID assigned to each message thread.
- ThreadSort: A sort order which follows the chain of conversation, from first message to most recent ones. If a conversation splits into multiple branches, ThreadSort will keep each of those branches together.
- ThreadSize: The total number of Messages (not documents -- see 'IsMessage' above) in the thread.
- Parent: The immediate parent of the current email in the thread. For the first message, this will be blank. For each subsequent one, it will be the DocID of the email which it replies to or is a forward of.
- Inclusive: Flags a minimal set of documents which can be viewed in order to read the message thread's entire conversation. In the simplest case this will be the last message in the thread, since it will quote all the previous messages. Note that inclusiveness is calculated based on message bodies and does not consider metadata such as an email's subject or attachments.
- Date: The date, as listed in the message's header.
- From: The sender, as listed in the message's header.
- To: The recipients, as listed in the message's header.
- CC: The carbon copy recipients, as listed in the message's header.
- BCC: The blind carbon copy recipients, as listed in the message's header.
- Subject: The subject, as listed in the message's header.
- Attachments: The names of any attached files, as listed in the message's header.
- Conversants: The names of all people mentioned in the From, To, CC, and BCC fields found in the entire email, based on reading the headers of any quoted messages as well as the primary message itself.
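To show how the threading fields relate to one another, here is a minimal Python sketch that builds an inclusive-only review list and computes each message's reply depth from the Parent field. It assumes export rows loaded as dictionaries, that ThreadSort is a sortable number, and that the Inclusive flag uses 'Y' in the same way IsMessage does; that last point is an assumption, since the definitions above do not state the flag's values. Function names and sample data are hypothetical.

```python
from collections import defaultdict


def inclusive_review_list(rows):
    """Return {ThreadID: [DocID, ...]} of inclusive messages in ThreadSort order."""
    threads = defaultdict(list)
    for row in sorted(rows, key=lambda r: r["ThreadSort"]):
        # Assumption: Inclusive is marked with 'Y', mirroring IsMessage.
        if row["IsMessage"] == "Y" and row["Inclusive"] == "Y":
            threads[row["ThreadID"]].append(row["DocID"])
    return dict(threads)


def reply_depth(rows):
    """Map each DocID to its depth in the thread (first message = 0)."""
    parent_of = {row["DocID"]: row["Parent"] for row in rows}
    depths = {}
    for doc_id in parent_of:
        depth, current, visited = 0, doc_id, {doc_id}
        # Follow Parent links until a blank (or unknown) parent is reached.
        while parent_of.get(current):
            current = parent_of[current]
            if current in visited:  # defensive guard against malformed data
                break
            visited.add(current)
            depth += 1
        depths[doc_id] = depth
    return depths


# Hypothetical thread: E1 is the first message, E2 and E4 reply to it, E3 replies to E2.
rows = [
    {"DocID": "E1", "IsMessage": "Y", "ThreadID": "T1", "ThreadSort": 1, "Parent": "", "Inclusive": "N"},
    {"DocID": "E2", "IsMessage": "Y", "ThreadID": "T1", "ThreadSort": 2, "Parent": "E1", "Inclusive": "N"},
    {"DocID": "E3", "IsMessage": "Y", "ThreadID": "T1", "ThreadSort": 3, "Parent": "E2", "Inclusive": "Y"},
    {"DocID": "E4", "IsMessage": "Y", "ThreadID": "T1", "ThreadSort": 4, "Parent": "E1", "Inclusive": "Y"},
]
review = inclusive_review_list(rows)
depths = reply_depth(rows)
```

In this sample the conversation branches after E1, so both branch endpoints (E3 and E4) are needed to read the whole thread.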



