Oracle AI - Kxodia 肯佐迪亞

In this lesson, we’ll describe the capabilities of three of the Oracle AI services that provide perceptual AI—Language, Speech, and Vision—and show how they can be used in Data Science notebooks.

OCI Language

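The service helps customers apply natural language processing (NLP) to mine insights from unstructured text. Developers can integrate pretrained NLP capabilities into their applications without needing data scientists to build custom models.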

Detects the language of your text (it recognizes 75 languages)

Identifies entities in text (names, places, dates, emails, currency, organizations, phone numbers; about 14 types in all)

Identifies sentiment for each aspect of text (for example, in "The food was great, but the service sucked!" the food is positive and the service is negative)

Identifies key phrases that represent important ideas or subjects

Classifies the general topic from a list of 600 categories and subcategories (for example, Domain: Animals and Plants, Sub-domain: Pets)

OCI Speech

It unlocks the data in audio tracks by converting speech to text.

Automatically transcribes audio and video files into text using advanced deep learning techniques
No data science expertise is required
Processes data directly in object storage
Generates timestamped, grammatically accurate transcriptions
SRT closed caption file support
Profanity filtering
Punctuates transcriptions
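
To make the timestamped transcription and SRT caption support above concrete, here is a minimal sketch of how timestamped segments map onto the SRT cue layout (a cue number, a "start --> end" line, then the text). The Segment class and both function names are hypothetical illustrations, not part of the OCI Speech API, and the service's real output schema is not shown in this lesson.

```python
from dataclasses import dataclass


@dataclass
class Segment:
    """One transcribed phrase with offsets in seconds (hypothetical shape)."""
    start: float
    end: float
    text: str


def srt_timestamp(seconds: float) -> str:
    """Format a time offset in seconds as the SRT timestamp HH:MM:SS,mmm."""
    total_ms = int(round(seconds * 1000))
    hours, rest = divmod(total_ms, 3_600_000)
    minutes, rest = divmod(rest, 60_000)
    secs, millis = divmod(rest, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d},{millis:03d}"


def to_srt(segments: list[Segment]) -> str:
    """Render segments as numbered SRT cues separated by blank lines."""
    cues = []
    for index, seg in enumerate(segments, start=1):
        timing = f"{srt_timestamp(seg.start)} --> {srt_timestamp(seg.end)}"
        cues.append(f"{index}\n{timing}\n{seg.text}\n")
    return "\n".join(cues)


if __name__ == "__main__":
    demo = [Segment(0.0, 1.5, "Hello."), Segment(1.5, 3.0, "Welcome to the lesson.")]
    print(to_srt(demo))
```

The service produces SRT files itself; the sketch only shows what the format contains, which helps when post-processing captions in a notebook.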

OCI Vision

Image Analysis

Object Detection: Detects objects inside an image
Image Classification: Labels the scene

Document AI

Text Recognition: Extracts text from images
Document Classification: Classifies documents into 10 different types based on appearance
Language Detection: Identifies the language from visual features of the text, rather than from the text itself
Table Extraction: Identifies and extracts tables from invoices, POs, and receipts
Key Value Extraction: Finds values for 13 common fields and line items in receipts

How does a data scientist take advantage of these capabilities? You can write code against the REST API or use any of the language SDKs. But for data scientists working in Data Science notebooks, it makes sense to use Python.


OCI Anomaly Detection

Identifies anomalies in time-series data and can detect anomalies across multiple signals at once.

Univariate and Multivariate Detection

The multivariate algorithm is called MSET2, which stands for Multivariate State Estimation Technique, and it's unique to Oracle. The 2 in the name refers to the patented enhancements by Oracle Labs.

Automatically identifies and fixes data quality issues: fewer false alarms and more accurate results

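When training an OCI Anomaly Detection model, you do not need to specify whether the model is intended for multivariate or univariate data; the service detects this automatically. If signals are determined to be correlated enough, OCI Anomaly Detection creates an internal multivariate model for those signals. For example, if a model is trained on 10 signals and 5 of them are found to be correlated enough for multivariate anomaly detection, the service creates one internal multivariate model for those 5. If the other 5 signals are not correlated with each other, it creates an internal univariate model for each of them. From the user's point of view the result is a single OCI Anomaly Detection model covering all 10 signals, but internally the signals are handled differently depending on what training found.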

You can also train a model on a single signal, which produces a univariate model.
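
The split between correlated and uncorrelated signals can be illustrated with a toy sketch. This is not MSET2 and not the criterion the service actually uses, which this lesson does not describe; it assumes a plain Pearson correlation and a hypothetical threshold of 0.8 simply to show signals being sorted into a multivariate group and a univariate group.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation of two equal-length sequences (0.0 if either is constant)."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    spread_x = sqrt(sum((a - mean_x) ** 2 for a in x))
    spread_y = sqrt(sum((b - mean_y) ** 2 for b in y))
    if spread_x == 0 or spread_y == 0:
        return 0.0
    return cov / (spread_x * spread_y)


def split_signals(signals, threshold=0.8):
    """Return (multivariate_names, univariate_names).

    A signal joins the multivariate group if its absolute correlation with
    at least one other signal reaches the threshold (an assumed rule).
    """
    names = list(signals)
    correlated = set()
    for i, first in enumerate(names):
        for second in names[i + 1:]:
            if abs(pearson(signals[first], signals[second])) >= threshold:
                correlated.update((first, second))
    multivariate = [name for name in names if name in correlated]
    univariate = [name for name in names if name not in correlated]
    return multivariate, univariate


if __name__ == "__main__":
    readings = {
        "temp": [1, 2, 3, 4, 5],
        "pressure": [2, 4, 6, 8, 10],  # moves in lockstep with temp
        "noise": [3, 1, 4, 1, 5],      # weakly related to the others
    }
    print(split_signals(readings))
```

With the demo readings, temp and pressure land in the multivariate group and noise is left for a univariate model, mirroring the 10-signal example above on a smaller scale.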

Train an OCI Anomaly Detection Model

Obtain training data: Training data should contain no anomalies and cover a complete business cycle.

Upload training data to Object Storage: The service requires that the training data be in object storage.

Create a data set for the training data: This tells the service which data will be used.

Train the model: Use the wizard to select training data and set parameters.
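
The first step above, preparing the training file before it is uploaded, might look like the sketch below. It assumes a CSV layout with a timestamp column followed by one column per signal; that layout and the build_training_csv helper are assumptions for illustration, so check the service documentation for the exact file requirements. The upload, data set creation, and training steps are not shown.

```python
import csv
import io
from datetime import datetime, timedelta, timezone


def build_training_csv(start, step, signals):
    """Serialize equally spaced readings to CSV text.

    The first column is an ISO 8601 UTC timestamp; each remaining column is
    one signal (assumed layout, not confirmed by this lesson).
    """
    lengths = {len(values) for values in signals.values()}
    if len(lengths) != 1:
        raise ValueError("all signals need the same number of readings")
    (row_count,) = lengths
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerow(["timestamp", *signals])
    for i in range(row_count):
        stamp = (start + i * step).strftime("%Y-%m-%dT%H:%M:%SZ")
        writer.writerow([stamp, *(values[i] for values in signals.values())])
    return buffer.getvalue()


if __name__ == "__main__":
    text = build_training_csv(
        start=datetime(2024, 1, 1, tzinfo=timezone.utc),
        step=timedelta(minutes=1),
        signals={"temp": [20.0, 20.5], "pressure": [1.0, 1.1]},
    )
    print(text)
```

The length check fails early when one signal has fewer readings than the others, which is cheaper to catch in a notebook than after the file is uploaded.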

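After the model is trained, it is best to test it again with a data set that contains anomalies. Depending on the results, you may want to retrain the model with a different False Alarm Probability (FAP). FAP is the probability that the model raises a false alarm, and it can be thought of as the model's sensitivity. The lower the FAP, the less likely the model is to report false alarms, but the less sensitive it is to real anomalies.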
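
The FAP trade-off can be seen in a generic sketch. It assumes the detector produces a numeric anomaly score and that the alarm threshold is chosen as an empirical quantile of scores from anomaly-free data; the lesson does not say how the service uses FAP internally, so both function names and the quantile rule are illustrative assumptions.

```python
def threshold_for_fap(normal_scores, fap):
    """Pick a threshold that roughly a fraction `fap` of anomaly-free scores exceed."""
    if not 0 < fap < 1:
        raise ValueError("fap must be between 0 and 1")
    ordered = sorted(normal_scores)
    # Position of the (1 - fap) empirical quantile, clamped to the last element.
    index = min(len(ordered) - 1, int((1 - fap) * len(ordered)))
    return ordered[index]


def flag_anomalies(scores, threshold):
    """Mark every score strictly above the threshold as an anomaly."""
    return [score > threshold for score in scores]


if __name__ == "__main__":
    normal = list(range(100))  # stand-in scores from anomaly-free data
    for fap in (0.10, 0.01):
        limit = threshold_for_fap(normal, fap)
        alarms = sum(flag_anomalies(normal, limit))
        print(f"FAP={fap}: threshold={limit}, false alarms on normal data={alarms}")
```

Lowering the FAP raises the threshold, so fewer normal points trigger alarms, but a genuine anomaly also has to score higher before it is reported.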