1. What is a 'Dataset' in Amorphic Data?
A Dataset is a logical name given by an Amorphic user to a group of one or more files of the same kind. The supported file types are:
- CSV files
- Excel documents
- Image files
- Video files
- Audio files
- PDF documents
- TXT documents
2. What is the difference between ‘Public Description’ and ‘Internal Description’?
Public Description is visible to all users of the platform, whereas Internal Description is visible only to the owners and viewers of that data set.
3. Can I limit the data set search results based on my choice?
Yes. You should generally choose ‘Public’ unless you have a very specific reason to hide information. Amorphic Data promotes the reuse of data by making it visible to a wide range of users. The search visibility options are:
- Public: All fields are visible in search results.
- Private: Only the public name and description of the dataset are visible in search results (all other fields are private to the owners and viewers of the dataset).
- None: Hides all fields from search results.
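The three visibility options above can be sketched as a simple field filter. This is only an illustration of the rules as described, not Amorphic's implementation; the `SearchVisibility` enum, the `visible_fields` function, and the metadata field names are hypothetical.

```python
from enum import Enum


class SearchVisibility(Enum):
    PUBLIC = "public"
    PRIVATE = "private"
    NONE = "none"


# Hypothetical field names standing in for the public name and description.
ALWAYS_PUBLIC_FIELDS = ("name", "public_description")


def visible_fields(metadata: dict, visibility: SearchVisibility) -> dict:
    """Return the metadata fields a search result would expose."""
    if visibility is SearchVisibility.PUBLIC:
        # Public: all fields are visible in search results.
        return dict(metadata)
    if visibility is SearchVisibility.PRIVATE:
        # Private: only the public name and description are visible.
        return {k: v for k, v in metadata.items() if k in ALWAYS_PUBLIC_FIELDS}
    # None: all fields are hidden from search results.
    return {}


example = {
    "name": "sales_orders",
    "public_description": "Daily sales orders",
    "internal_description": "Loaded nightly from the ERP export",
}
print(visible_fields(example, SearchVisibility.PRIVATE))
```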
4. What is a ‘Domain’ in Amorphic Data?
A ‘Domain’ is analogous to a ‘Business Unit’ in a traditional system. It can be thought of simply as a group of data sets for a specific need.
5. What are the ways to import data into Amorphic Data?
- API-based: You can upload a file directly in the browser
- JDBC: You can connect to any JDBC source
- S3: You can connect to an existing AWS account’s S3
6. Can I tag my data set?
Yes, you can tag a data set while registering it. Provide the tags as a comma-separated list under the ‘Keyword’ section.
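As a rough illustration of the comma-separated format, the sketch below splits a keyword string into individual tags. The `parse_keywords` helper is hypothetical, and trimming whitespace and dropping empty entries are assumptions about how such input might be normalized, not documented Amorphic behavior.

```python
def parse_keywords(raw: str) -> list:
    """Split a comma-separated keyword string into a list of tags."""
    tags = []
    for part in raw.split(","):
        tag = part.strip()  # assumption: surrounding whitespace is not significant
        if tag:             # assumption: empty entries (e.g. trailing commas) are ignored
            tags.append(tag)
    return tags


print(parse_keywords("finance, erp ,monthly,"))
```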
7. Can I append data to my existing data set?
Yes. Amorphic Data, being a data lake, uses an append-only model by default. This means that you can add data to your data set but never remove it. Data in the data lake is immutable. Often, you only care about the most recent record for a given key and not the whole history (such as with transactional data from an ERP system). Amorphic can automatically create a view of your data that shows only the most recent records and not the entire history. Select ‘Latest Record’ to have Amorphic create such a view for you. You must also indicate that your dataset is for analytic purposes. Later, after defining the schema, you will be asked to indicate which fields form the keys for your records and which indicate the latest record.
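To make the idea of a ‘Latest Record’ view concrete, here is a minimal sketch of the logic such a view could apply to append-only data: for each key, keep only the row with the greatest value in the field that marks recency. The function name, the sample rows, and the column names (`order_id`, `updated_at`) are hypothetical, and this is not how Amorphic itself builds the view.

```python
def latest_records(rows, key_fields, latest_field):
    """Keep, for each key, the row with the greatest value of latest_field."""
    latest = {}
    for row in rows:
        key = tuple(row[field] for field in key_fields)
        current = latest.get(key)
        # On ties, the row appended later wins (assumption).
        if current is None or row[latest_field] >= current[latest_field]:
            latest[key] = row
    return list(latest.values())


# Append-only history: order 1 appears twice, nothing is ever overwritten.
history = [
    {"order_id": 1, "status": "NEW", "updated_at": "2024-01-01"},
    {"order_id": 2, "status": "NEW", "updated_at": "2024-01-02"},
    {"order_id": 1, "status": "SHIPPED", "updated_at": "2024-01-03"},
]

for record in latest_records(history, ["order_id"], "updated_at"):
    print(record)
```

The underlying history stays intact; only the derived view hides the older row for order 1.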
8. Can I tag my data set to indicate that the data set contains confidential information?
Yes. At the time of data set registration, if your data contains Personally Identifiable Information (PII) or Credit Card data (PCI), please flag it under ‘PROTECTED DATA’. Please take extra care with these kinds of data and consider carefully whether they should be uploaded to Amorphic Data. Check with your local Information Security team if you have any questions. You are responsible for the datasets that you create and the data that you upload to Amorphic Data.
9. Can I get notifications for my data sets?
Yes, there are 2 ways you can set your notification preferences:
- All: You will receive emails for both successful and failed uploads of data to Amorphic Data.
- Error: You will receive emails only when an uploaded file fails to load to Amorphic Data.
By default, you will get notifications about Access Requests, Access Grants, etc.
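The two upload notification preferences can be summarized as a small decision rule. The `should_send_upload_email` function below is a hypothetical sketch of that rule only; it does not cover the default access-related notifications and is not part of any Amorphic API.

```python
def should_send_upload_email(preference: str, upload_succeeded: bool) -> bool:
    """Decide whether an upload result triggers an email for a given preference."""
    choice = preference.strip().lower()
    if choice == "all":
        # All: emails for both successful and failed uploads.
        return True
    if choice == "error":
        # Error: emails only when the upload fails.
        return not upload_succeeded
    raise ValueError(f"Unknown notification preference: {preference!r}")


print(should_send_upload_email("Error", upload_succeeded=True))
print(should_send_upload_email("Error", upload_succeeded=False))
```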
10. Can I create a ‘New Domain’ in Amorphic Data?
Creating a ‘New Domain’ is limited to the administrator of the platform. Please contact your ‘admin’ to add a new domain to the platform.
11. Can I modify the data dictionary after I create a ‘data set’?
Yes. You can modify some fields in the data dictionary, such as Public Description, Internal Description, Protected Data tags, and Search Visibility.
12. How many files can be ingested into a data set?
There is no limit on how many files can be added to a data set.
13. Can I delete a data set?
Yes. You can tag a data set for deletion if you have ‘OWNER’ access to the data set. The physical deletion is carried out by an administrator.
14. Is there any limit on how many data sets can be created?
No. There is no limit on the number of data sets.
15. How can I connect my Business Intelligence tool to analyze and visualize my data?
The ‘Connection Details’ section under the data set details provides detailed instructions for connecting to that data set. It includes the following information:
- Connection string
- Host
- Port
- Database
- Table Name
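As a sketch of how these five values might be kept together before configuring a BI tool, the example below stores them in a small record and builds a preview query for the table. The `ConnectionDetails` class and every value shown are made-up placeholders, and the `LIMIT` syntax is an assumption about the SQL dialect; always copy the real values from your data set's ‘Connection Details’.

```python
import re
from dataclasses import dataclass

# Accept plain or schema-qualified identifiers only (e.g. "sales.orders").
_IDENTIFIER = re.compile(r"[A-Za-z_][A-Za-z0-9_.]*")


@dataclass(frozen=True)
class ConnectionDetails:
    connection_string: str
    host: str
    port: int
    database: str
    table_name: str

    def preview_query(self, limit: int = 10) -> str:
        """Build a simple query that previews the first rows of the table."""
        if not _IDENTIFIER.fullmatch(self.table_name):
            raise ValueError(f"Unexpected table name: {self.table_name!r}")
        return f"SELECT * FROM {self.table_name} LIMIT {int(limit)}"


# Placeholder values for illustration only.
details = ConnectionDetails(
    connection_string="jdbc:postgresql://example-host:5432/example_db",
    host="example-host",
    port=5432,
    database="example_db",
    table_name="sales.orders",
)
print(details.preview_query(5))
```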