With all the news around ChatGPT lately, I thought I would see how it performed when asked questions about SAS 9 Metadata Security. This is a niche specialty with some complex rules when you get down into the details, so I was not expecting it to do very well. I was both impressed and disappointed. I was impressed by how quickly it responded with confident-sounding, generally correct answers to high-level questions on this specialized topic. I was disappointed that it gave confident-sounding but incorrect answers when asked about the details.
Here are some excerpts from my chat. It was a long chat so I have only selected some of the responses for this blog post. It’s still a long blog post though, so I will be very impressed if you make it all the way to the end!
First question:
This is a good start: it mentions the SAS Metadata Server, users, groups, access control, and metadata objects, how permissions can be assigned at different levels, and how they are distinct from file system permissions. Some metadata permissions are listed, all of which do exist. I could get pedantic about the Read, Write and Delete definitions, but at a high level they will do. The Administer permission definition is just wrong.
Next question:
It’s hard to fault this one. An impressive array of quality resources.
Next question:
Another good answer, albeit not quite what I was hoping for. These are good generic security practices. I was hoping for a reference to the Danish Golden Rules or GEL rules mentioned in this post.
It seemed very good at providing confident and mostly correct responses on high-level topics. Now it is time to start getting more detailed:
A good answer on a permission for reading the metadata on a metadata object (as distinct from the data behind the metadata object). It covers that there are other permissions for other aspects and it does not cover file system permissions. I’m not so sure about the bit about running “PROC CONTENTS and PROC PRINT statements on the metadata objects” but it’s hard to quibble at this point.
Next question, can it explain the difference between RM and R:
Another good answer. Impressive.
Next question:
It looks like a good answer but is missing some nuance. The bit about ReadMetadata is correct but the bit about Read depends on the circumstances. For example is the Metadata Libname Engine involved or not? However, for a short answer it is a good starting point for further discussions.
How does it handle the permissions flipped?
At a glance this is correct – they can see the underlying data but not the metadata. But whether they can use that ability to see the data depends on whether they are using an application that requires them to access the metadata first. If they can’t see the metadata they may not get the opportunity to see the data. There is also the possibility they may or may not be able to bypass the metadata layer depending on how things have been configured. A good general answer but I would prefer at least a hint that there is more depth to this.
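To make the nuance in the last two answers a little more concrete, here is a toy Python sketch. It is only an illustration of the reasoning above, not SAS's actual authorization logic: the function name and both flags are my own invention, and it deliberately ignores host and file system permissions and any other configuration that decides whether the metadata layer can be bypassed.

```python
def can_reach_data(read_metadata: bool, read: bool,
                   app_needs_metadata_first: bool,
                   read_is_enforced: bool) -> bool:
    """Toy model (hypothetical, not a SAS API) of whether a user gets to the data.

    read_metadata / read: whether the user is granted RM / R on the object.
    app_needs_metadata_first: the application finds the table through
        metadata, so the user must be able to see the metadata object.
    read_is_enforced: something on the access path checks the Read
        permission (for example, the Metadata Libname Engine is involved).
    """
    # Without ReadMetadata the object is invisible to a metadata-driven app.
    if app_needs_metadata_first and not read_metadata:
        return False
    # Read only matters when something on the path actually enforces it.
    if read_is_enforced and not read:
        return False
    return True


# R granted but RM denied: blocked in an application that goes through metadata.
print(can_reach_data(read_metadata=False, read=True,
                     app_needs_metadata_first=True, read_is_enforced=True))

# RM granted but R denied: the outcome depends on whether Read is enforced.
print(can_reach_data(read_metadata=True, read=False,
                     app_needs_metadata_first=True, read_is_enforced=True))
print(can_reach_data(read_metadata=True, read=False,
                     app_needs_metadata_first=True, read_is_enforced=False))
```

The point of the sketch is simply that the same pair of grants can lead to different outcomes depending on the access path, which is the depth I was hoping the answers would at least hint at.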
Next question, can it explain the identity hierarchy?
It starts off well, describing a hierarchy of users and groups that plays a part in assigning metadata permissions and controlling access to metadata objects. It then goes off the rails by covering folders, confusing this with the separate inheritance path concept.
So next, the natural question was how does it differ from the inheritance path?
This is a good answer other than one aspect. It makes the classic mistake of suggesting tables and columns inherit from libraries but, putting that aside for now, the rest of the answer is good and clears up the confusion from the previous response.
Next I asked it specifically about how tables inherit permissions:
I was trying to get it to answer a specific question about the mistake it made earlier, and it walked away from the conversation! ;)
So I did try again later:
This confident answer is dead wrong! This time I provided some feedback:
This time, after correction, the response was correct.
I wondered if this correction would be remembered so asked the question again:
It provided a different answer that was still wrong. It was now suggesting it was some form of multiple inheritance.
Time to change the subject: who can administer users and groups in SAS 9?
There is some truth to this answer in that it is roles (with their assigned capabilities and implied permissions) and, to some extent, explicit permissions that control who can administer users and groups, but I don’t know where it got those made-up permission names from! I challenged it on this:
Next I challenged it on the distinction between a “SAS Administrators” group and role (a bit pedantic but I wanted to see how it responded):
A reasonable response, but it made up some roles:
I discovered that making things up whilst sounding confident, also called hallucination, is a known feature of ChatGPT and other AI systems. On their ChatGPT page (https://openai.com/blog/chatgpt/), OpenAI mention in the Limitations section that “ChatGPT sometimes writes plausible-sounding but incorrect or nonsensical answers.”
As I mentioned way back at the beginning of this post, I was both impressed and disappointed. I grew up with computers in the early 1980s, so I can look back and see things like today’s everyday video calls as science-fiction-become-reality. I am excited by what advances the next few decades will bring. AI is certainly going to change the way we live, as it already does today. I do find it amazing that I can chat to a computer system about a specialist subject and get confident, generally correct answers, at least at a high level. What disappoints me is the confident wrong answers. As a knowledgeable person in this area, I know they are wrong. If I had less experience, how would I know an answer was wrong when it sounds so right? Should or could its responses come with a confidence level? As humans we might say, “I don’t know the details yet”, or “I’m not 100% confident but I think …” to let the other person know they should independently look up, check and verify what we are about to say. I guess, as with anything we read on the internet, we need to be prepared that AI output may be wrong and should be verified against reliable sources. Of course, in this particular case we can check against the extensive, detailed documentation available from SAS Institute, along with many years of SAS user group papers and presentations.
What are your experiences with ChatGPT? How do your experiences compare with mine above? Please leave a comment below and let me know.