package artifact
Provides classes for dealing with Stack Overflow artifacts.
It also provides means to serialize and deserialize them from JSON files.
Tutorial
The main class to consider is ch.usi.inf.reveal.parsing.artifact.StackOverflowArtifact, representing a Stack Overflow discussion.
The main way to obtain a Stack Overflow discussion is to deserialize it from a JSON file, using the ArtifactSerializer object.
import ch.usi.inf.reveal.parsing.artifact._
val jsonFilePath = /*path to json file*/
val discussion = ArtifactSerializer.deserializeFromFile(jsonFilePath)
You can now inspect the discussion, for example by getting the question (a StackOverflowQuestion object) or the answers (a Seq of StackOverflowAnswer):
scala> val question = discussion.question
question: ch.usi.inf.reveal.parsing.artifact.StackOverflowQuestion = ...
scala> val answers = discussion.answers
answers: Seq[ch.usi.inf.reveal.parsing.artifact.StackOverflowAnswer] = ...
To get the poster of the question, read its owner field. The result is an Option[StackOverflowUser], because the user may be missing from the Stack Overflow dump:
scala> val posterUser = question.owner
posterUser: Option[ch.usi.inf.reveal.parsing.artifact.StackOverflowUser] = Some(StackOverflowUser(...))
You can also easily get the comments (objects of StackOverflowComment), for example the ones posted for the question:
scala> val questionComments = question.comments
questionComments: Seq[ch.usi.inf.reveal.parsing.artifact.StackOverflowComment] = ...
The easy way to get all information units (ch.usi.inf.reveal.parsing.units.InformationUnit) and the HASTs (classes implementing the ch.usi.inf.reveal.parsing.model.HASTNode trait) for an element (question or answer or comment) is:
scala> val allUnits = discussion.units
allUnits: Seq[ch.usi.inf.reveal.parsing.units.InformationUnit] = List(...)
scala> val astNodes = allUnits.map { _.astNode }
astNodes: Seq[ch.usi.inf.reveal.parsing.model.ASTNode] =
For further information about the information units, the HAST and the model, please check the documentation of ch.usi.inf.reveal.parsing.model and ch.usi.inf.reveal.parsing.units.
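As a complement to the tutorial, the following sketch shows one way to gather every comment in a discussion, from the question and from each answer. It does not depend on the library: Comment, Question, Answer and Discussion are hypothetical, simplified stand-ins that keep only a few of the fields documented on this page, and allComments is an illustrative helper, not part of the package.

```scala
// Hypothetical stand-ins: only the fields needed here, named after the
// fields of the case classes documented on this page.
final case class Comment(id: Int, postId: Int, score: Int)
final case class Question(id: Int, comments: Seq[Comment])
final case class Answer(id: Int, questionId: Int, comments: Seq[Comment])
final case class Discussion(question: Question, answers: Seq[Answer])

object AllComments {
  /** Comments on the question, followed by the comments on each answer in order. */
  def allComments(d: Discussion): Seq[Comment] =
    d.question.comments ++ d.answers.flatMap(_.comments)

  def main(args: Array[String]): Unit = {
    val discussion = Discussion(
      Question(1, Seq(Comment(10, 1, 2))),
      Seq(
        Answer(2, 1, Seq(Comment(11, 2, 0), Comment(12, 2, 5))),
        Answer(3, 1, Seq.empty)
      )
    )
    // Print the ids of all comments found in the discussion.
    println(allComments(discussion).map(_.id))
  }
}
```

With the real classes, the same expression should apply to a StackOverflowArtifact, since question.comments and each answer's comments are both Seq[StackOverflowComment].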
Type Members
trait Artifact extends JsonSerializable with Product
A software artifact, composed of information units.
trait SourceInfo extends JsonSerializable
A source info represents a binding between an information unit and the original contents it was extracted from.
case class StackOverflowAnswer(id: Int, questionId: Int, comments: Seq[StackOverflowComment], creationDate: Date, communityOwnedDate: Option[Date], lastActivityDate: Date, lastEditDate: Option[Date], score: Int, isAccepted: Boolean, owner: Option[StackOverflowUser], informationUnits: Seq[InformationUnit]) extends StackOverflowPost with Product with Serializable
A Stack Overflow answer in a discussion.
- id
the answer id, as from the Stack Overflow database.
- questionId
the id of the question this answer replies to.
- comments
the set of comments posted to this answer.
- creationDate
the date this answer was posted.
- lastActivityDate
the date of the most recent activity on this answer.
- lastEditDate
the last date this answer was edited, if it ever was.
- score
the score of this answer.
- isAccepted
true if this was the accepted answer in the discussion.
- owner
the user who posted the answer, if it is present in the Stack Overflow dump.
- informationUnits
the sequence of information units for this answer.
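The score and isAccepted fields are enough to pick out the accepted answer or to rank answers. The sketch below uses a hypothetical stand-in class, Answer, that keeps only three of the documented fields. The helper names accepted and byScore are illustrative and are not part of the package.

```scala
// Hypothetical stand-in carrying only the fields used below.
final case class Answer(id: Int, score: Int, isAccepted: Boolean)

object AnswerRanking {
  /** The accepted answer, if the discussion has one. */
  def accepted(answers: Seq[Answer]): Option[Answer] =
    answers.find(_.isAccepted)

  /** Answers ordered from highest to lowest score; ties keep their input order. */
  def byScore(answers: Seq[Answer]): Seq[Answer] =
    answers.sortBy(a => -a.score)

  def main(args: Array[String]): Unit = {
    val answers = Seq(
      Answer(id = 1, score = 3, isAccepted = false),
      Answer(id = 2, score = 7, isAccepted = true),
      Answer(id = 3, score = 7, isAccepted = false)
    )
    println(accepted(answers).map(_.id))
    println(byScore(answers).map(_.id))
  }
}
```

Returning an Option from accepted reflects that a discussion need not have an accepted answer.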
case class StackOverflowArtifact(question: StackOverflowQuestion, answers: Seq[StackOverflowAnswer]) extends Artifact with Product with Serializable
Represents a Stack Overflow artifact, i.e., a discussion.
- question
the discussion's question.
- answers
the discussion's answers.
case class StackOverflowComment(id: Int, postId: Int, creationDate: Date, replyToUser: Option[StackOverflowUser], score: Int, isEdited: Boolean, owner: Option[StackOverflowUser], informationUnits: Seq[InformationUnit]) extends StackOverflowElement with Product with Serializable
A Stack Overflow comment to a post.
- id
the comment id, as from the dump.
- postId
the id of the post (question or answer) this comment was posted to.
- creationDate
the date this comment was created.
- replyToUser
optionally, the user this comment replies to.
- score
the score of this comment.
- isEdited
true if this comment has ever been edited.
- owner
the poster of this comment, if present.
trait StackOverflowElement extends JsonSerializable
A Stack Overflow element, that is, a question, answer or comment.
trait StackOverflowPost extends StackOverflowElement
A Stack Overflow post, that is, a question or an answer.
case class StackOverflowQuestion(id: Int, title: String, comments: Seq[StackOverflowComment], tags: Seq[String], creationDate: Date, lastActivityDate: Date, lastEditDate: Option[Date], communityOwnedDate: Option[Date], closedDate: Option[Date], closedReason: Option[String], score: Int, viewCount: Int, owner: Option[StackOverflowUser], informationUnits: Seq[InformationUnit]) extends StackOverflowPost with Product with Serializable
A Stack Overflow question in a discussion.
- id
the question id, as from the Stack Overflow database.
- comments
the set of comments posted to this question.
- tags
a sequence of strings representing the tags for this question.
- creationDate
the date this question was posted.
- lastActivityDate
the date of the most recent activity on this question.
- lastEditDate
the last date this question was edited, if it ever was.
- communityOwnedDate
the date on which this question became community-owned, if present.
- closedDate
the date on which this question was closed, if it ever was.
- closedReason
a string representing the reason for which the question was closed, if defined.
- score
the score of this question.
- viewCount
the number of times this question was viewed.
- owner
the user who posted the question, if it is present in the Stack Overflow dump.
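Several fields of a question are optional, so code that reads them has to handle the missing case. The following sketch illustrates this for closedDate and closedReason. Question is a hypothetical stand-in with only three of the documented fields, and describe is an illustrative helper, not part of the package.

```scala
import java.util.Date

// Hypothetical stand-in carrying only the fields used below.
final case class Question(id: Int, closedDate: Option[Date], closedReason: Option[String])

object ClosedStatus {
  /** A short human-readable status; falls back to a default text when no reason is recorded. */
  def describe(q: Question): String = q.closedDate match {
    case Some(_) => s"closed (${q.closedReason.getOrElse("no reason recorded")})"
    case None    => "open"
  }

  def main(args: Array[String]): Unit = {
    val open   = Question(1, closedDate = None, closedReason = None)
    val closed = Question(2, closedDate = Some(new Date(0L)), closedReason = Some("duplicate"))
    println(describe(open))
    println(describe(closed))
  }
}
```

The same pattern applies to lastEditDate, communityOwnedDate and owner, which are also Option values.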
case class StackOverflowUser(id: Int, acceptRate: Option[Int], displayName: String, link: Option[String], profileImage: Option[String], reputation: Int, userType: Option[String]) extends JsonSerializable with Product with Serializable
A Stack Overflow user as present in the dump.
- id
the id of the user as present in the dump.
- acceptRate
the rate of accepted answers for this user.
- displayName
the display name for the user.
- link
an optional URL for the user page.
- profileImage
an optional profile image URL for the user.
- reputation
the user reputation.
- userType
an optional string representing the user type (one of unregistered, registered, moderator, or does_not_exist, if defined).
case class XmlSourceInfo(node: XmlElementNode) extends SourceInfo with Product with Serializable
An XML source info is a binding between an XML/HTML element representing an information unit and the information unit itself.
- node
the original XML element representing the source for this information unit.
Value Members
object ArtifactSerializer
Serializer/Deserializer utility for artifacts.
This version works only for StackOverflowArtifact.
object DateSerializer extends CustomSerializer[Date]
A custom JSON serializer/deserializer for java.util.Date.
Serializes (deserializes) dates as (from) JSON strings with the format "yyyy-MM-dd".
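To illustrate the "yyyy-MM-dd" format, the sketch below converts between java.util.Date and such strings using java.text.SimpleDateFormat from the JDK. That the pattern is interpreted as a SimpleDateFormat pattern is an assumption, since this page only gives the pattern itself. DateFormatSketch and its two methods are hypothetical names and do not reproduce the implementation of DateSerializer.

```scala
import java.text.SimpleDateFormat
import java.util.Date

object DateFormatSketch {
  // SimpleDateFormat is not thread-safe, so a fresh instance is built on each use.
  private def format = new SimpleDateFormat("yyyy-MM-dd")

  /** Renders a date as a day-precision string; the time of day is dropped. */
  def toJsonString(date: Date): String = format.format(date)

  /** Parses a day-precision string as midnight of that day in the JVM's default time zone. */
  def fromJsonString(text: String): Date = format.parse(text)

  def main(args: Array[String]): Unit = {
    val parsed = fromJsonString("2014-03-07")
    println(toJsonString(parsed))
  }
}
```

Because the format keeps only year, month and day, any time-of-day information in a Date would not survive a round trip through a string of this shape.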